What are the major types of different Regression methods in Machine Learning?

Top Machine learning interview questions and answers

November 13, 2019

Regression analysis is about finding the relationship between dependent variables and independent variables. Regression techniques differ in factors such as the number of independent variables, the type of dependent variable, and the shape of the regression line. The seven most commonly used regression methods in machine learning are:

Linear Regression
Logistic Regression
Polynomial Regression
Stepwise Regression
Ridge Regression
Lasso Regression
Elastic Net Regression

Let’s understand each type of regression briefly.

Linear Regression

Linear regression is one of the most widely used techniques in regression modeling. The dependent variable is continuous, whereas the independent variables may be continuous or discrete. It establishes the relationship between one dependent variable and one or more independent variables using a best-fit straight line, which is why the model is called linear.

Linear regression is usually represented by the equation Y = a + b*X + e,

where a is the intercept, b is the slope of the line, and e is the error term. This equation is used to predict the value of the target (dependent) variable from the predictor (independent) variable. Linear regression has two types: simple linear regression and multiple linear regression.
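As a minimal sketch of the equation above, the following pure-Python snippet estimates a and b by ordinary least squares for a single predictor. The function name fit_simple_linear and the hours/scores data are illustrative assumptions, not something taken from the article.

```python
def fit_simple_linear(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: hours studied vs. exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

a, b = fit_simple_linear(hours, scores)
# The error term e is whatever the line does not explain.
residuals = [y - (a + b * x) for x, y in zip(hours, scores)]
prediction = a + b * 6.0  # predicted score for 6 hours of study
```

For this made-up data the fitted intercept is about 47.7 and the slope about 4.1, and the residuals sum to roughly zero, as they always do for a least-squares line with an intercept.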

Simple linear regression finds the relationship between a single independent variable and the dependent variable, whereas multiple linear regression establishes the relationship between several independent variables and the dependent variable.

Logistic Regression

Logistic regression is used to estimate the probability of an event, such as success versus failure. It is used when the dependent variable is binary. The predicted value of Y ranges from 0 to 1.
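To make the idea of a predicted probability between 0 and 1 concrete, here is a small sketch that fits a one-predictor logistic model by plain gradient descent. The learning rate, step count, function names, and pass/fail data are all assumptions chosen for illustration; practical libraries use more robust solvers.

```python
import math


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = sigmoid(a + b*x) by gradient descent on the log loss."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the average log loss with respect to a and b.
        grad_a = sum(sigmoid(a + b * x) - y for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(a + b * x) - y) * x for x, y in zip(xs, ys)) / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Hypothetical data: hours studied vs. pass (1) or fail (0).
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

a, b = fit_logistic(hours, passed)
probabilities = [sigmoid(a + b * x) for x in hours]
# Classify as success when the estimated probability reaches 0.5.
labels = [1 if p >= 0.5 else 0 for p in probabilities]
```

The model is linear in the log of the odds, log(p / (1 - p)) = a + b*x, while the probability p itself follows an S-shaped curve in x.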

Logistic regression doesn’t require a linear relationship between the dependent and independent variables. It can handle various types of relationships because it applies a non-linear log transformation to the predicted odds.

Polynomial Regression

A regression equation is a polynomial regression equation if the power of the independent variable is greater than one, for example:

y = a + b*x^2
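One way to see how y = a + b*x^2 can be fitted is to note that the model is still linear in the coefficients: squaring x first and then running ordinary least squares on the squared values recovers a and b. The sketch below assumes that approach; the helper fit_line and the data are hypothetical.

```python
def fit_line(us, ys):
    """Least-squares fit of y = a + b*u; returns (a, b)."""
    n = len(us)
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    suy = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
    suu = sum((u - mean_u) ** 2 for u in us)
    b = suy / suu
    return mean_y - b * mean_u, b


# Hypothetical data generated from y = 1 + 3*x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 + 3.0 * x ** 2 for x in xs]

# Polynomial fit: regress y on the transformed feature x^2.
a_poly, b_poly = fit_line([x ** 2 for x in xs], ys)

# Straight-line fit on the raw x for comparison.
a_lin, b_lin = fit_line(xs, ys)
```

On this symmetric data the straight-line fit has a slope of zero and misses the pattern entirely, while the fit on x^2 recovers a = 1 and b = 3.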

In polynomial regression, the best-fit line need not be a straight line; it can be a curve that follows the data points more closely.

Stepwise Regression

This type of regression is used when there are multiple independent variables. In stepwise regression, the independent variables are selected automatically, without human intervention. Stepwise regression fits the model by adding or dropping covariates one at a time based on a specified criterion.

Stepwise regression adds and removes predictors (independent variables) as needed at each step.
Forward selection starts with the most significant predictor in the model and adds one variable at each step.
Backward elimination starts with all predictors in the model and removes the least significant variable at each step.
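The forward-selection variant described above can be sketched in pure Python as follows. The stopping criterion here (a minimum drop in the residual sum of squares) is an assumption made to keep the example short; statistical packages more commonly use p-values, AIC, or BIC. The helper names and the data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def rss(X, y, cols):
    """Residual sum of squares of an OLS fit on an intercept plus the given columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    # Normal equations: (R^T R) beta = R^T y.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    beta = solve(A, b)
    return sum((t - sum(w * v for w, v in zip(beta, r))) ** 2 for r, t in zip(rows, y))


def forward_selection(X, y, min_improvement=1e-6):
    """Add one predictor per step until the RSS stops improving enough."""
    selected, remaining = [], list(range(len(X[0])))
    best = rss(X, y, selected)
    while remaining:
        score, col = min((rss(X, y, selected + [c]), c) for c in remaining)
        if best - score < min_improvement:
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected


# Hypothetical data: y depends on columns 0 and 2 only; column 1 is irrelevant.
X = [[1.0, 1.0, 3.0], [2.0, 0.0, 1.0], [3.0, 1.0, 4.0], [4.0, 0.0, 1.0],
     [5.0, 1.0, 5.0], [6.0, 0.0, 9.0], [7.0, 1.0, 2.0], [8.0, 0.0, 6.0]]
y = [1.0 + 2.0 * row[0] + 3.0 * row[2] for row in X]

selected = forward_selection(X, y)
```

Backward elimination would run the same loop in reverse: start with every column selected and drop the one whose removal increases the RSS the least.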

The aim of stepwise regression is to maximize predictive power with the minimum number of predictor variables.

Ridge Regression

Ridge regression is used when the data suffers from multicollinearity, which means the independent variables are highly correlated. Under multicollinearity, even though the least-squares estimates are unbiased, their variances are large, so the estimated values can deviate far from the true values. By adding a degree of bias to the regression estimates, ridge regression reduces the standard errors.

Recall the linear equation Y = a + b*x.
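The effect of the ridge penalty on collinear predictors can be sketched with two nearly identical features. The snippet below solves the penalized normal equations (X^T X + lam*I) beta = X^T y directly for the two-feature case; the function name, the penalty value lam = 1, and the data are assumptions for illustration, and the intercept is omitted because the made-up variables are already centered.

```python
def ridge_two_features(x1, x2, y, lam):
    """Solve (X^T X + lam*I) beta = X^T y for two predictors, no intercept."""
    s11 = sum(u * u for u in x1) + lam
    s22 = sum(v * v for v in x2) + lam
    s12 = sum(u * v for u, v in zip(x1, x2))
    t1 = sum(u * t for u, t in zip(x1, y))
    t2 = sum(v * t for v, t in zip(x2, y))
    # Cramer's rule for the 2x2 system.
    det = s11 * s22 - s12 * s12
    return (t1 * s22 - t2 * s12) / det, (t2 * s11 - t1 * s12) / det


# Hypothetical centered data: x2 is almost a copy of x1 (multicollinearity).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-1.9, -1.1, 0.1, 0.9, 2.0]
y = [-4.1, -1.9, 0.2, 1.7, 4.1]

ols = ridge_two_features(x1, x2, y, lam=0.0)    # ordinary least squares
ridge = ridge_two_features(x1, x2, y, lam=1.0)  # with the ridge penalty

ols_norm = sum(b * b for b in ols)
ridge_norm = sum(b * b for b in ridge)
```

With lam = 0 the two coefficients drift apart (about 1.18 and 0.83 here) because the data cannot tell the features apart; with lam = 1 they are pulled toward each other (about 0.97 and 0.95) and the overall coefficient size shrinks.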

In a linear equation, the prediction error can be decomposed into two components: one due to bias and one due to variance. Prediction error can arise from either of these components or from both.

Lasso Regression

Lasso is another variation, in which the residual sum of squares plus a penalty on the coefficients is minimized. It differs from ridge regression only in how it penalizes large coefficients: it uses the absolute value |βj| instead of the square of βj as its penalty. In statistics, this is known as the L1 norm.
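A consequence of the absolute-value penalty is that small coefficients can be set exactly to zero. The sketch below shows this with coordinate descent and the soft-thresholding operator, one common way to solve the lasso problem; the objective scaling (0.5 * RSS + lam * sum of |βj|), the function names, and the data are assumptions for illustration.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(columns, y, lam, sweeps=100):
    """Minimize 0.5 * RSS + lam * sum(|beta_j|), one coefficient at a time."""
    p = len(columns)
    beta = [0.0] * p
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Residual with feature j's current contribution left out.
            partial = [
                t - sum(beta[k] * columns[k][i] for k in range(p) if k != j)
                for i, t in enumerate(y)
            ]
            rho = sum(c * r for c, r in zip(col, partial))
            beta[j] = soft_threshold(rho, lam) / sum(c * c for c in col)
    return beta


# Hypothetical centered data: y depends strongly on x1 and only weakly on x2.
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [1.0, -2.0, 0.0, 2.0, -1.0]
y = [3.0 * u + 0.1 * v for u, v in zip(x1, x2)]

beta = lasso_coordinate_descent([x1, x2], y, lam=2.0)
```

Here the weak coefficient on x2 is driven exactly to zero while the coefficient on x1 is shrunk from 3 to about 2.8. A squared penalty would shrink both coefficients but leave neither at exactly zero.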

Suppose a problem has two parameters. In constrained form, ridge regression is expressed by β1² + β2² ≤ s. This implies that the ridge regression coefficients have the smallest RSS (loss function) among all points that lie within the circle given by β1² + β2² ≤ s.

Similarly, for lasso the constraint becomes |β1| + |β2| ≤ s. This implies that the lasso coefficients have the smallest RSS (loss function) among all points that lie within the diamond given by |β1| + |β2| ≤ s.

Elastic Net Regression

Elastic net is a hybrid of the lasso and ridge regression techniques. It is trained with both L1 and L2 penalties as regularizers. Elastic net is useful when there are multiple correlated features: lasso is likely to pick one of these at random, while elastic net is likely to pick both.

A practical advantage of trading off between lasso and ridge is that it allows elastic net to inherit some of ridge’s stability under rotation.
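The difference in how lasso and elastic net treat correlated features can be sketched with an extreme case: two identical columns. The coordinate-descent solver below is a minimal illustration, assuming the objective 0.5 * RSS + l1 * sum(|βj|) + 0.5 * l2 * sum(βj²); the function names, penalty values, and data are hypothetical.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def elastic_net(columns, y, l1, l2, sweeps=200):
    """Coordinate descent for 0.5*RSS + l1*sum(|b_j|) + 0.5*l2*sum(b_j**2)."""
    p = len(columns)
    beta = [0.0] * p
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Residual with feature j's current contribution left out.
            partial = [
                t - sum(beta[k] * columns[k][i] for k in range(p) if k != j)
                for i, t in enumerate(y)
            ]
            rho = sum(c * r for c, r in zip(col, partial))
            # The L2 term adds l2 to the denominator; l2 = 0 gives plain lasso.
            beta[j] = soft_threshold(rho, l1) / (sum(c * c for c in col) + l2)
    return beta


# Hypothetical data: two perfectly correlated (identical) features.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [2.0 * v for v in x]

lasso_like = elastic_net([x, x], y, l1=1.0, l2=0.0)  # L1 penalty only
enet = elastic_net([x, x], y, l1=1.0, l2=1.0)        # L1 and L2 together
```

With the L1 penalty alone, nearly all the weight lands on whichever copy is updated first (about 1.9 on the first feature and effectively zero on the second), so the choice is arbitrary rather than meaningful. With the L2 term added, the weight is shared equally, at about 0.905 per feature.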