Loss Function in TensorFlow
Generally, in machine learning models we predict a value given a set of inputs. The model has a set of weights and biases that are tuned based on the training data, which consists of pairs of inputs and actual values. A loss function determines how far the predicted values deviate from the actual values in the training data, and we update the model weights to make the loss as small as possible.
There are two broad types of loss functions:
=>> Regression losses
=>> Classification losses
In Regression losses,
1. Mean squared error
2. Mean absolute error
3. Mean bias error
In Classification losses,
1. Hinge loss or SVM loss
2. Cross-entropy loss
Mean squared error:
Mathematical formulation :-
MSE = (1/n) * Σ (y_i - ŷ_i)²
where y_i is the actual value, ŷ_i is the predicted value, and n is the number of observations.
As the name suggests, mean squared error measures the average of the squared differences between predictions and actual observations. It considers only the average magnitude of the errors, irrespective of their direction.
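As a minimal sketch (the tensors and values below are our own, assuming TensorFlow 2.x eager execution), MSE can be computed directly with TensorFlow ops:

import tensorflow as tf

actual = tf.constant([1.0, 2.0, 3.0, 4.0])
predicted = tf.constant([1.5, 2.0, 2.0, 5.0])
# mean of squared differences: (0.25 + 0 + 1 + 1) / 4 = 0.5625
mse = tf.reduce_mean(tf.square(actual - predicted))
print(mse)  # 0.5625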
Mean absolute error:
Mathematical formulation :-
MAE = (1/n) * Σ |y_i - ŷ_i|
Mean absolute error is measured as the average of the absolute differences between predictions and actual observations. Like MSE, it measures the magnitude of the errors without considering their direction, but since it does not square the errors, MAE is more robust to outliers.
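A matching sketch for MAE with the same made-up values; only the square is replaced by an absolute value:

import tensorflow as tf

actual = tf.constant([1.0, 2.0, 3.0, 4.0])
predicted = tf.constant([1.5, 2.0, 2.0, 5.0])
# mean of absolute differences: (0.5 + 0 + 1 + 1) / 4 = 0.625
mae = tf.reduce_mean(tf.abs(actual - predicted))
print(mae)  # 0.625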
Mean bias error:
Mathematical formulation :-
MBE = (1/n) * Σ (y_i - ŷ_i)
Mean bias error is much less common in machine learning applications. It is the same as MAE except that we do not take absolute values, so some caution is needed here: positive and negative errors can cancel each other out, which affects how well the value reflects the true error.
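A sketch with the same made-up values shows the cancellation effect; the signed errors partially cancel even though three of the four predictions are off:

import tensorflow as tf

actual = tf.constant([1.0, 2.0, 3.0, 4.0])
predicted = tf.constant([1.5, 2.0, 2.0, 5.0])
# signed errors -0.5, 0, 1, -1 partially cancel: (-0.5 + 0 + 1 - 1) / 4 = -0.125
mbe = tf.reduce_mean(actual - predicted)
print(mbe)  # -0.125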
Hinge loss or SVM loss:
Mathematical formulation :-
L = Σ_{j ≠ y} max(0, s_j - s_y + Δ)
where s_y is the score of the correct category, s_j are the scores of the incorrect categories, and Δ is a safety margin (typically 1).
Hinge loss is used for maximum-margin classification, which is nothing but the setting of support vector machines. Simply put, the score of the correct category should be greater than the score of each incorrect category by some safety margin.
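A small NumPy sketch of the per-example multiclass hinge loss (the scores and class index are made up, and the margin is assumed to be 1):

import numpy as np

def hinge_loss(scores, correct_class, margin=1.0):
    # each wrong class contributes max(0, s_j - s_correct + margin)
    margins = np.maximum(0.0, scores - scores[correct_class] + margin)
    margins[correct_class] = 0.0  # the correct class contributes nothing
    return np.sum(margins)

scores = np.array([3.2, 5.1, -1.7])  # made-up class scores
print(hinge_loss(scores, correct_class=0))  # max(0, 5.1-3.2+1) + max(0, -1.7-3.2+1) = 2.9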
Cross-entropy loss:
Mathematical formulation :-
CE = -Σ y_i * log(ŷ_i)
where y_i is the actual probability of class i and ŷ_i is the predicted probability.
This is the most widely used loss in classification problems. It measures how far the predicted probability diverges from the actual label: the loss increases as the predicted probability of the correct class decreases.
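A quick NumPy illustration of this behavior, with made-up probabilities; the confident prediction is penalized far less than the poor one:

import numpy as np

actual = np.array([0.0, 1.0, 0.0])        # one-hot actual label
confident = np.array([0.05, 0.90, 0.05])  # most probability on the right class
poor = np.array([0.60, 0.30, 0.10])       # most probability on a wrong class

def cross_entropy(y, p):
    return -np.sum(y * np.log(p))

print(cross_entropy(actual, confident))  # -log(0.9) ~ 0.105
print(cross_entropy(actual, poor))       # -log(0.3) ~ 1.204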
In TensorFlow, a similar loss function is the l2_loss function.
For example, suppose the actual values are
actual = tf.constant([[1, 2], [3, 4]], dtype = tf.float32)
and the predicted values are
predicted = tf.constant([[0, 1], [4, 5]], dtype = tf.float32)
Now, the loss is
((1 - 0)² + (2 - 1)² + (3 - 4)² + (4 - 5)²) / 2 = 4 / 2 = 2
So here the loss is 2.
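This can be checked with tf.nn.l2_loss, which computes sum(t ** 2) / 2 for a tensor t (a minimal sketch, assuming TensorFlow 2.x eager execution):

import tensorflow as tf

actual = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
predicted = tf.constant([[0, 1], [4, 5]], dtype=tf.float32)
# tf.nn.l2_loss(t) computes sum(t ** 2) / 2
loss = tf.nn.l2_loss(actual - predicted)
print(loss)  # 2.0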
Now, moving on to another loss function: cross-entropy. This is the most frequently used loss in TensorFlow. Here we again take some imaginary values for the actual and predicted outputs, and we use NumPy for the mathematical calculations. Since these calculations are difficult to do manually, we first compute the softmax values with NumPy and then cross-check the results against TensorFlow.
Softmax converts unnormalized scores into normalized values, turning them into a probability distribution.
Now, let us take two examples.
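A minimal NumPy softmax sketch (the helper and input values below are our own):

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))  # shift by the max for numerical stability
    return e / np.sum(e)

print(softmax(np.array([1.0, 2.0, 3.0])))  # [0.09 0.24 0.67], sums to 1
print(softmax(np.array([0.5, 0.5, 4.0])))  # the largest score gets most of the mass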
Observing the above code, the smaller input values are converted to smaller probabilities and the larger values to higher probabilities, and the softmax outputs add up to 1.
Now, we are going to calculate the cross-entropy value for single elements (an element is a scalar value). The first parameter is the actual value and the second is the predicted value.
If the actual and predicted values are vectors or lists, we can calculate the cross-entropy by taking the mean of the element-wise cross-entropy values.
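The exact helper used in the original walkthrough is not reproduced here, but an element-wise (binary) cross-entropy consistent with that description might look like this (function name and sample values are our own):

import numpy as np

def cross_entropy(actual, predicted):
    # element-wise (binary) cross-entropy; works for scalars and arrays alike
    return -(actual * np.log(predicted) + (1 - actual) * np.log(1 - predicted))

print(cross_entropy(1.0, 0.9))  # single elements: -log(0.9) ~ 0.105

actual = np.array([1.0, 0.0, 1.0])
predicted = np.array([0.9, 0.2, 0.8])
print(np.mean(cross_entropy(actual, predicted)))  # mean of element-wise values ~ 0.184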
Normally the actual probabilities are either 0 or 1. The predicted softmax probabilities were observed earlier; these predicted values are computed by the previous layer of the neural network. Now we apply softmax to the logits, where the logits are the values computed by the network before softmax is applied.
We do not apply softmax to the actual values, since they are already probabilities.
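A NumPy sketch of this step, with made-up one-hot labels and logits:

import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / np.sum(e)

actual = np.array([0.0, 1.0, 0.0])  # one-hot labels, already a probability distribution
logits = np.array([2.0, 1.0, 0.1])  # raw network outputs, before softmax

predicted = softmax(logits)         # softmax applied only to the logits
loss = -np.sum(actual * np.log(predicted))
print(predicted)  # [0.659 0.242 0.099]
print(loss)       # ~1.417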
Now, we will calculate the same values using TensorFlow functions.
Here the predicted values are passed to the TensorFlow function as pre-softmax logits, and the function computes the softmax and the cross-entropy together. Observe that the loss for a single example is a single element, and that the overall loss is the mean of the per-example cross-entropy values.
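A sketch using tf.nn.softmax_cross_entropy_with_logits, which takes the raw logits and applies the softmax internally (same made-up values as the NumPy version above, assuming TensorFlow 2.x):

import tensorflow as tf

actual = tf.constant([[0.0, 1.0, 0.0]])  # one-hot labels, one row per example
logits = tf.constant([[2.0, 1.0, 0.1]])  # pre-softmax network outputs

# softmax and cross-entropy computed together, in a numerically stable way
loss = tf.nn.softmax_cross_entropy_with_logits(labels=actual, logits=logits)
print(loss)                  # [1.417...], one value per example, matching the NumPy result
print(tf.reduce_mean(loss))  # mean over the batch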
Understanding these loss functions and their TensorFlow implementations will help us build our own neural networks.









