Machine Learning - Generalization

Generalization

Generalization refers to a machine learning model's ability to perform well on new, unseen data. After being trained on a training set, a model that generalizes well can take in new data and make accurate predictions on it. The success of a model depends mainly on this ability. If the model fits the training data too closely, learning its noise along with its real patterns, it will struggle to generalize.
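One common way to estimate how well a model generalizes is to hold back part of the data and evaluate on it only after training. The sketch below is a minimal illustration, not a prescribed method: the synthetic data (y = 2x + 1 plus noise), the 25% test fraction, and the helper names train_test_split, fit_line and mse are all assumptions made for this example.

```python
import random

def train_test_split(data, test_fraction=0.25, seed=0):
    """Shuffle a copy of the data and hold out a fraction as unseen test data."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(points):
    """Ordinary least squares for y = a * x + b, in closed form."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    return a, mean_y - a * mean_x

def mse(model, points):
    """Mean squared error of the fitted line on a set of points."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in points) / len(points)

# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 2 * x + 1 + rng.gauss(0, 0.5)))

train, test = train_test_split(data)
model = fit_line(train)          # the model only ever sees the training part
train_mse = mse(model, train)
test_mse = mse(model, test)      # error on data the model has never seen

print(f"train MSE = {train_mse:.3f}, test MSE = {test_mse:.3f}")
print(f"generalization gap = {test_mse - train_mse:.3f}")
```

A small gap between the two errors suggests the model generalizes; a test error much larger than the training error is the usual sign that it does not.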

Generalization is closely related to the concept of overfitting. An overfitted model does not generalize well: it makes inaccurate predictions on new data even though it predicts the training data correctly, which makes it of little practical use. The opposite problem is also possible.

When the model has not been trained enough on the data, the result is underfitting. An underfitted model is just as useless, because it cannot make accurate predictions even on the training data.
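The contrast between overfitting and underfitting can be seen by comparing training error with error on unseen data for models of different flexibility. The following is a rough sketch under stated assumptions: it uses k-nearest-neighbour regression (chosen here only because it is easy to write from scratch; the tutorial does not name a model), synthetic data of the form y = sin(x) plus noise, and hypothetical helper names knn_predict, mse and sample.

```python
import math
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, points, k):
    """Mean squared error of the k-NN predictor on a set of points."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in points) / len(points)

def sample(n, rng):
    """Synthetic data (assumed for illustration): y = sin(x) plus Gaussian noise."""
    pts = []
    for _ in range(n):
        x = rng.uniform(0, 2 * math.pi)
        pts.append((x, math.sin(x) + rng.gauss(0, 0.3)))
    return pts

rng = random.Random(0)
train = sample(40, rng)
test = sample(200, rng)   # unseen data from the same source

results = {}
# k=1 memorizes the training set, k=len(train) always predicts the overall mean.
for k in (1, 5, len(train)):
    results[k] = (mse(train, train, k), mse(train, test, k))
    print(f"k={k:2d}  train MSE={results[k][0]:.3f}  test MSE={results[k][1]:.3f}")
```

With k=1 each training point is its own nearest neighbour, so the training error is zero while the error on unseen data is not: that is the overfitted case. With k equal to the size of the training set the model predicts the same mean value everywhere and does poorly even on the training data, which corresponds to underfitting.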

[Figure: Generalization 1 (i2tutorials)]

If the model is overtrained, it will capture all the relevant information in the training data, along with its noise, but will fail badly when new data is introduced. Such a model is not capable of generalizing; in other words, the model has been overtrained on the training data.

[Figure: Generalization 2 (i2tutorials)]

We may think that the longer we train, the better the model will be. That may be true, but only in the sense that the model gets better at describing the training data. To build predictive models that are capable of generalizing, one should know when to stop training so that the model does not overfit.
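One widely used way to decide when to stop is early stopping: keep a separate validation set, check the validation error after every epoch, and stop once it has not improved for a while. The sketch below is one possible illustration rather than a definitive recipe. The degree-9 polynomial model, the synthetic data (y = sin(3x) plus noise), the learning rate, the patience of 50 epochs, and all function names are assumptions chosen for this example.

```python
import math
import random

DEGREE = 9  # deliberately flexible model so that overfitting is possible

def features(x):
    """Polynomial features 1, x, x^2, ..., x^DEGREE for x in [-1, 1]."""
    return [x ** j for j in range(DEGREE + 1)]

def predict(w, x):
    return sum(wj * fj for wj, fj in zip(w, features(x)))

def mse(w, points):
    return sum((predict(w, x) - y) ** 2 for x, y in points) / len(points)

def gradient_step(w, points, lr):
    """One full-batch gradient descent step on the mean squared error."""
    grad = [0.0] * len(w)
    for x, y in points:
        err = predict(w, x) - y
        for j, fj in enumerate(features(x)):
            grad[j] += 2 * err * fj / len(points)
    return [wj - lr * gj for wj, gj in zip(w, grad)]

def sample(n, rng):
    """Synthetic data (assumed for illustration): y = sin(3x) plus Gaussian noise."""
    pts = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        pts.append((x, math.sin(3 * x) + rng.gauss(0, 0.3)))
    return pts

def train_with_early_stopping(train, val, lr=0.05, max_epochs=2000, patience=50):
    """Train on `train`, but keep the weights that did best on `val`."""
    w = [0.0] * (DEGREE + 1)
    best_w, best_val, best_epoch = w, mse(w, val), 0
    for epoch in range(1, max_epochs + 1):
        w = gradient_step(w, train, lr)
        val_loss = mse(w, val)
        if val_loss < best_val:
            best_w, best_val, best_epoch = w, val_loss, epoch
        elif epoch - best_epoch >= patience:
            break  # validation loss has not improved for `patience` epochs
    return best_w, best_val, best_epoch, epoch

rng = random.Random(1)
train, val = sample(15, rng), sample(50, rng)
best_w, best_val, best_epoch, last_epoch = train_with_early_stopping(train, val)

print(f"stopped after epoch {last_epoch}, best validation MSE at epoch {best_epoch}")
print(f"train MSE = {mse(best_w, train):.3f}, validation MSE = {best_val:.3f}")
```

The function returns the weights from the best validation epoch rather than the last one, so extra training after that point cannot make the returned model worse on the validation set. Whether the stopping rule actually triggers before max_epochs depends on the data and the settings.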