Multiple Linear Regression
Multiple linear regression (MLR), often simply called multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response (dependent) variable. It models the linear relationship between the independent (explanatory) variables and the dependent (response) variable.
Multiple linear regression is calculated using the following formula:

yi = β0 + β1xi1 + β2xi2 + ... + βpxip + ϵ

where, for i = 1, ..., n observations:
yi = dependent variable
xi1, ..., xip = explanatory variables
β0 = y-intercept (constant term)
β1, ..., βp = slope coefficients for each explanatory variable
ϵ = the model’s error term (residual)
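As a rough sketch of how these coefficients are estimated, the ordinary least squares solution can be computed with NumPy's `lstsq` (the data below is made up for illustration):

```python
import numpy as np

# Hypothetical data: 6 observations, 2 explanatory variables.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0],
              [6.0, 5.0]])
# Response generated as y = 1 + 2*x1 + 1*x2 (no noise, for illustration).
y = np.array([5.0, 6.0, 11.0, 12.0, 17.0, 18.0])

# Prepend a column of ones so the first coefficient is the intercept β0.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimizes the sum of squared residuals.
beta, *_ = np.linalg.lstsq(X_design, y, rcond=None)
residuals = y - X_design @ beta  # the error terms ϵ

print("intercept:", beta[0])  # ≈ 1.0
print("slopes:", beta[1:])    # ≈ [2.0, 1.0]
```

Since the example response was generated exactly from the model, the fitted coefficients recover β0 = 1, β1 = 2, β2 = 1 with residuals near zero.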
The multiple regression model is based on some assumptions which are given below:
- There is a linear relationship between the dependent variable and each independent variable.
- The independent variables should not be too highly correlated with each other.
- The yi observations are selected independently and randomly from the population.
- Residuals (error terms) should be normally distributed with a mean of 0 and constant variance σ².
- The residuals are homoscedastic: plotted against the fitted values, they show a roughly constant, pattern-free (rectangular) spread.
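These assumptions can be checked informally in code. A minimal sketch with simulated data (the coefficients and sample size are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
# Simulated response satisfying the assumptions: linear mean, normal errors.
y = 0.5 + 1.5 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=n)

Xd = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
fitted = Xd @ beta
resid = y - fitted

# Residual mean should be (numerically) zero when an intercept is included.
print("residual mean:", resid.mean())
# Predictors should not be too highly correlated with each other.
print("corr(x1, x2):", np.corrcoef(X[:, 0], X[:, 1])[0, 1])
# Homoscedasticity hint: |residuals| should not trend with fitted values.
print("corr(fitted, |resid|):", np.corrcoef(fitted, np.abs(resid))[0, 1])
```

In practice, residual plots (residuals vs. fitted values, Q-Q plots) are the standard visual version of these checks.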
Coefficient of Determination
The coefficient of determination, also termed R-squared (R²), is a statistical metric that measures how much of the variation in the outcome can be explained by the variation in the independent variables.
R² always increases as more predictors are added to the multiple linear regression model, even when those predictors are unrelated to the outcome variable.
R² by itself therefore cannot be used to decide which predictors should be included in a model and which should be excluded.
R² ranges between 0 and 1, where 0 indicates that the outcome cannot be predicted by any of the independent variables and 1 indicates that the outcome can be predicted without error from the independent variables.
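A small sketch (with simulated, invented data) showing both how R² is computed and that it never decreases when an unrelated predictor is added:

```python
import numpy as np

def r_squared(y, y_hat):
    # R² = 1 - SS_res / SS_tot
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

def ols_predict(X, y):
    # Fit by ordinary least squares and return the fitted values.
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    return Xd @ beta

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
junk = rng.normal(size=n)        # deliberately unrelated to the outcome
y = 2.0 * x1 + rng.normal(size=n)

r2_one = r_squared(y, ols_predict(x1.reshape(-1, 1), y))
r2_two = r_squared(y, ols_predict(np.column_stack([x1, junk]), y))
print(r2_one, r2_two)  # r2_two >= r2_one despite junk being irrelevant
```

This is why adjusted R², which penalizes extra predictors, is often reported alongside plain R².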
When interpreting the results of a multiple regression, each beta coefficient describes the effect of its variable while holding all other variables constant. The output of a multiple regression analysis can be displayed horizontally as an equation, or vertically in table form.
The Difference Between Linear and Multiple Regression
- Simple linear regression models the relationship between one independent variable and one dependent variable, whereas multiple linear regression models the relationship between several independent variables and one dependent variable.
- Simple linear regression cannot suffer from multicollinearity, whereas multiple linear regression can, in which case it needs to be detected and reduced.
- Simple linear regression fits a straight line, whereas multiple linear regression can capture non-linear relationships in the data by including transformed predictors (e.g. x²), while remaining linear in its coefficients.
- In simple linear regression there is only one predictor, so correlation between predictors cannot arise; in multiple linear regression the independent variables may be correlated with each other, which leads to multicollinearity.
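One common way to quantify multicollinearity among predictors is the variance inflation factor (VIF): each predictor is regressed on all the others, and VIF = 1 / (1 - R²) for that regression. A rough sketch with simulated data (the helper function and data are illustrative, not a library API):

```python
import numpy as np

def vif(X):
    # Variance inflation factor for each column of X.
    n, p = X.shape
    factors = []
    for j in range(p):
        others = np.delete(X, j, axis=1)
        Xd = np.column_stack([np.ones(n), others])
        beta, *_ = np.linalg.lstsq(Xd, X[:, j], rcond=None)
        resid = X[:, j] - Xd @ beta
        r2 = 1 - np.sum(resid ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return factors

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)              # independent of x1 -> VIF near 1
x3 = x1 + 0.1 * rng.normal(size=n)   # nearly a copy of x1 -> large VIF

v_independent = vif(np.column_stack([x1, x2]))
v_collinear = vif(np.column_stack([x1, x2, x3]))
print(v_independent)  # both near 1
print(v_collinear)    # x1 and x3 strongly inflated
```

A rule of thumb often cited is that VIF values above roughly 5–10 signal problematic multicollinearity.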