cost function

Uploaded by Soumia Sandjak

A cost function in machine learning is a function that quantifies the error between predicted outcomes and actual outcomes[1]. In the context of neural networks, the cost function measures the difference between the network's predicted output and the actual (target) output. During training, the cost function is minimized to reduce this error and improve the accuracy of the network.

For a neural network, the cost function is typically defined as the average (or sum) of the per-example losses over the training set. Because a network's cost surface is non-convex, it can contain many local minima; the goal of training is to find the lowest of these, called the global minimum[1].

In general form, the cost function for a neural network can be written as the average loss over the n training examples:

J(W, b) = (1/n) * Σ L(ŷ_i, y_i)

where W and b are the network's weights and biases, y_i is the actual output for example i, ŷ_i is the predicted output, and L is a per-example loss (for example, squared error).

Gradient descent is used to minimize the cost function by iteratively adjusting the weights and biases of the network. The gradient of the cost function is computed with respect to the weights and biases (via backpropagation), and the weights and biases are then updated in the opposite direction of the gradient[3].
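The update rule described above can be sketched for a one-parameter linear model trained with an MSE cost; the toy data, learning rate, and iteration count below are illustrative assumptions, not taken from the source:

```python
import numpy as np

# Toy data generated from y = 2x + 1, so the optimal parameters are known.
X = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * X + 1.0

w, b = 0.0, 0.0   # weight and bias, initialized at zero
lr = 0.05         # learning rate (step size)

for _ in range(2000):
    y_hat = w * X + b                 # forward pass: predicted output
    err = y_hat - y                   # prediction error
    # Gradients of the MSE cost J = (1/n) * sum(err^2) w.r.t. w and b
    grad_w = 2.0 * np.mean(err * X)
    grad_b = 2.0 * np.mean(err)
    # Update in the opposite direction of the gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))   # converges near w=2, b=1
```

For a full neural network the same update is applied to every layer's weights, with the gradients supplied by backpropagation rather than the closed-form expressions used here.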

The choice of cost function in neural networks is crucial as it directly impacts the training process and
the model's ability to learn from data. The appropriate cost function for neural networks depends on the
specific task and the nature of the output[4]. For example, mean squared error (MSE) is suitable for
regression tasks, while categorical cross-entropy loss is suitable for multi-class classification tasks.
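The two losses named above can be computed directly; the target values, one-hot labels, and predicted probabilities below are made-up examples for illustration:

```python
import numpy as np

# Regression: mean squared error between targets and predictions
y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5,  0.0, 2.0])
mse = np.mean((y_true - y_pred) ** 2)

# Multi-class classification: categorical cross-entropy.
# Rows of t are one-hot targets; rows of p are predicted class
# probabilities (each row sums to 1).
t = np.array([[1, 0, 0],
              [0, 1, 0]])
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])
cce = -np.mean(np.sum(t * np.log(p), axis=1))

print(round(mse, 4), round(cce, 4))
```

Note that cross-entropy only penalizes the probability assigned to the correct class, which is why it pairs naturally with a softmax output layer.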

In summary, a cost function quantifies the error between a model's predictions and the actual outcomes. In neural networks it is the objective minimized during training, and choosing a cost function appropriate to the task directly affects how well the model learns from data.

Citations:

[1] https://www.simplilearn.com/tutorials/machine-learning-tutorial/cost-function-in-machine-learning

[2] https://www.youtube.com/watch?v=NJpABYQB9PI

[3] https://stats.stackexchange.com/questions/154879/a-list-of-cost-functions-used-in-neural-networks-alongside-applications
[4] https://www.geeksforgeeks.org/neural-networks-which-cost-function-to-use/

[5] https://www.javatpoint.com/cost-function-in-machine-learning

The Mean Squared Error (MSE) is a loss function used for regression problems. It is computed by averaging the squared differences between the original and predicted values over the dataset. MSE is sensitive to outliers: given several examples with the same input feature values, the optimal prediction is their mean target value. Compare this with the Mean Absolute Error (MAE), where the optimal prediction is the median. MSE is therefore a good choice if you believe that your target data, conditioned on the input, is normally distributed around a mean value, and when it is important to penalize large errors heavily.
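The mean-versus-median claim above can be checked numerically; the target values and the grid search over candidate predictions are illustrative assumptions:

```python
import numpy as np

# Several targets sharing the same input; one value is an outlier.
y = np.array([1.0, 1.0, 1.0, 10.0])

# Evaluate both losses for a grid of candidate constant predictions.
candidates = np.linspace(0.0, 10.0, 1001)
mse = [np.mean((y - c) ** 2) for c in candidates]
mae = [np.mean(np.abs(y - c)) for c in candidates]

best_mse = candidates[int(np.argmin(mse))]   # minimizer of MSE: the mean, 3.25
best_mae = candidates[int(np.argmin(mae))]   # minimizer of MAE: the median, 1.0
print(best_mse, best_mae)
```

The outlier drags the MSE-optimal prediction toward itself, while the MAE-optimal prediction stays at the median, which is the sensitivity to outliers described above.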

The formula for MSE is:

MSE = (1/n) * Σ(actual − forecast)²

where:

- n is the sample size

- actual is the actual data value

- forecast is the predicted data value
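The formula can be applied directly; the actual and forecast values below are made-up examples:

```python
import numpy as np

actual = np.array([12.0, 15.0, 9.0, 10.0])     # observed data values
forecast = np.array([11.0, 16.0, 8.0, 12.0])   # predicted data values

# MSE = (1/n) * sum of squared differences
n = len(actual)
mse = (1.0 / n) * np.sum((actual - forecast) ** 2)
print(mse)   # (1 + 1 + 1 + 4) / 4 = 1.75
```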

MSE is the mean of the squared errors between the data and the model's predictions. A larger MSE indicates that the data points are widely dispersed around the mean prediction, whereas a smaller MSE indicates they are tightly clustered around it. A smaller MSE is preferred because it means the estimates deviate less from the true values: the lower the MSE, the smaller the error and the better the estimator.

MSE is used in various applications, such as predicting future house prices, image reconstruction, and demand forecasting. It is one of the most widely used loss functions in neural networks, where it is minimized during training to reduce the gap between the network's predicted and actual outputs.

