
Ridge Regression in Machine Learning

Ridge Regression is a regularization technique, or in simple words, a variation of Linear Regression. It is used when the data suffers from multicollinearity. Under multicollinearity, the least-squares estimates remain unbiased, but their variance is large, so the predicted values deviate from the actual values. The underlying linear equation also has an error term:

                                                                    Y = mX + c + error term

Prediction errors arise from bias and variance; ridge regression reduces the effect of multicollinearity by adding a penalty term controlled by the lambda (regularization) parameter.

Ridge Regression performs no feature selection: it shrinks the coefficient values toward zero but never all the way to zero. It is also called the L2 regularization technique. Let's take a dataset and perform ridge regression.
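The shrinkage behaviour described above can be sketched directly in numpy. This is an illustrative example, not the article's code: ridge minimizes ||y − Xw||² + α·||w||², which has the closed-form solution w = (XᵀX + αI)⁻¹Xᵀy. The synthetic data and α = 1.0 below are assumptions.

```python
import numpy as np

# Synthetic regression problem: 50 samples, 4 features (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
true_w = np.array([3.0, -2.0, 0.5, 1.0])
y = X @ true_w + rng.normal(scale=0.1, size=50)

# Closed-form ridge solution: w = (X^T X + alpha * I)^(-1) X^T y.
alpha = 1.0
n_features = X.shape[1]
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

print(w_ridge)
# Coefficients are shrunk toward zero, but none becomes exactly zero —
# this is why ridge does not perform feature selection.
print(np.all(w_ridge != 0))  # True
```

A larger alpha shrinks the coefficients more strongly, but they still never reach exactly zero (unlike L1/lasso regularization).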

The implementation of Ridge regression is:

In this dataset we have 5 columns, of which 4 are independent variables and 1 is the dependent variable.

Import the necessary libraries: numpy, pandas, and matplotlib.

import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

import sklearn

We can load the data by using the read_csv function.

Data = pd.read_csv("D:\\ML CSV\\classi\\50_Startups.csv")

Data = pd.read_csv("path of data/name of data.csv")



R&D Spend | Administration | Marketing Spend | State | Profit

By using the head function, we can print the first five rows of our data.

Now, we can separate the features and the target using the iloc function.
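The feature/target split can be sketched as follows. The tiny DataFrame here is an illustrative stand-in with the 50_Startups column names shown above, not the real data:

```python
import pandas as pd

# Two illustrative rows with the 50_Startups columns (values are examples).
Data = pd.DataFrame({
    "R&D Spend": [165349.2, 162597.7],
    "Administration": [136897.8, 151377.6],
    "Marketing Spend": [471784.1, 443898.5],
    "State": ["New York", "California"],
    "Profit": [192261.83, 191792.06],
})

X = Data.iloc[:, :-1]   # first four columns: the independent variables
y = Data.iloc[:, -1]    # last column: the Profit target

print(X.shape, y.shape)  # (2, 4) (2,)
```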





Here we can split the data into training and test sets using the train_test_split method, which we import from sklearn.

from sklearn.model_selection import train_test_split
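A minimal sketch of the split, using simple placeholder arrays; the test_size and random_state values are assumptions, not taken from the article:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the 50_Startups features and target.
X = np.arange(200).reshape(50, 4)
y = np.arange(50)

# 80% of the rows go to training, 20% to testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (40, 4) (10, 4)
```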


After splitting, we fit our model, i.e. Ridge regression. For this we import the model along with some required metrics, such as mean squared error and r2_score.

from sklearn.linear_model import Ridge

from sklearn.metrics import mean_squared_error,r2_score





Ridge(alpha=0, copy_X=True, fit_intercept=True, max_iter=None, normalize=True, random_state=None, solver='auto', tol=0.001)
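A self-contained sketch of fitting the model and predicting on the test set. The synthetic data and alpha=1.0 are assumptions (the repr above shows alpha=0, i.e. no penalty at all):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Synthetic stand-in for the training/test split (illustrative only).
rng = np.random.default_rng(42)
X = rng.normal(size=(50, 4))
y = X @ np.array([2.0, -1.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=50)
X_train, X_test = X[:40], X[40:]
y_train, y_test = y[:40], y[40:]

# Fit ridge regression with an assumed penalty strength.
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)

# Predict the held-out test values.
y_pred = ridge.predict(X_test)
print(y_pred.shape)  # (10,)
```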

After training, we can predict the test values.




array([192261.83, 191792.06, 191050.39, 182901.99, 166187.94, 156991.12,       156122.51, 155752.6 , 152211.77, 149759.96, 146121.95, 144259.4 ,       141585.52, 134307.35, 132602.65, 129917.04, 126992.93, 125370.37,       124266.9 , 122776.86, 118474.03, 111313.02, 110352.25, 108733.99,       108552.04, 107404.34, 105733.54, 105008.31, 103282.38, 101004.64,        99937.59,  97483.56,  97427.84,  96778.92,  96712.8 ,  96479.51,        90708.19,  89949.14,  81229.06,  81005.76,  78239.91,  77798.83,        71498.49,  69758.98,  65200.33,  64926.08,  49490.75,  42559.73,        35673.41,  14681.4 ])

After predicting, we can check the mean_squared_error and r2_score using the metrics functions.
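The two metrics can be sketched on a tiny hand-made example; these numbers are illustrative, not the article's results:

```python
from sklearn.metrics import mean_squared_error, r2_score

# Illustrative true values and predictions.
y_test = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]

# MSE averages the squared residuals: (0.25 + 0 + 0.25 + 0) / 4 = 0.125.
mse = mean_squared_error(y_test, y_pred)

# R2 = 1 - SS_res / SS_tot = 1 - 0.5 / 20 = 0.975.
r2 = r2_score(y_test, y_pred)

print(mse, r2)  # 0.125 0.975
```

An R2 close to 1 (like the 0.939 reported below) means the model explains most of the variance in the target.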




Mean squared error : 77506468.16885379

R2 score           : 0.9393955917820573
