In this article I will demystify why some of the coefficients of the predictor variables in Lasso regression become exactly zero, while in Ridge regression the coefficients approach zero but generally do not become exactly zero.
If you are new to Ridge and Lasso linear regression, you can check out this article about regression on Analytics Vidhya.
In short, Lasso and Ridge regression are used when there is multicollinearity among the predictors and to reduce overfitting. Both shrink the coefficients: with Lasso some coefficients go all the way to zero, while with Ridge they only approach zero.
But after looking at the cost functions for Lasso and Ridge regression, you may wonder why one reduces coefficients to exactly zero while the other only brings them close to zero.
Cost Function for Lasso Regression:-
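A common way to write this cost function is given below. The notation is assumed here for illustration: n observations (x_i, y_i), p predictors, intercept \beta_0, and a tuning parameter \lambda \ge 0.

$$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|$$

The first sum is the least squares term and the second is the L1 penalty. Equivalently, one can minimise the first sum subject to \sum_j |\beta_j| \le s for some budget s.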
Let us call the minimising (least squares) term A and the constraint term B, the sum of the absolute values of the coefficients.
Cost Function for Ridge Regression:-
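In the same assumed notation (n observations, p predictors, tuning parameter \lambda \ge 0), the Ridge cost function is commonly written as:

$$\sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p}\beta_j x_{ij}\Big)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$

The least squares term is unchanged; only the penalty differs, using squared coefficients instead of absolute values. The equivalent constrained form minimises the first sum subject to \sum_j \beta_j^2 \le s.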
Let us call the constraint term C, the sum of the squared coefficients; as before, the minimising term is A.
So in both Lasso and Ridge regression we need to minimise the least squares term subject to a constraint. Let us look at this using some graphs.
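Before turning to the graphs, the difference can already be seen with a single coefficient. The sketch below is an illustration, not part of the original derivation: it assumes a hypothetical one-dimensional loss (b - z)^2, where z is the unpenalised least squares estimate, and adds either lam * |b| (Lasso) or lam * b^2 (Ridge). Under that assumption both minimisers have simple closed forms. The function names are made up for this example.

```python
def lasso_1d(z, lam):
    """Minimiser of (b - z)**2 + lam * abs(b): soft-thresholding at lam / 2."""
    shrunk = max(abs(z) - lam / 2.0, 0.0)
    return shrunk if z >= 0 else -shrunk


def ridge_1d(z, lam):
    """Minimiser of (b - z)**2 + lam * b**2: proportional shrinkage."""
    return z / (1.0 + lam)


z = 0.8  # hypothetical least squares estimate
for lam in (0.5, 1.0, 2.0, 4.0):
    print(lam, lasso_1d(z, lam), ridge_1d(z, lam))
```

Once lam / 2 exceeds |z|, the Lasso solution is exactly zero, while the Ridge solution keeps getting smaller but stays nonzero for any finite lam.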
Let us use two predictor variables, with beta1 and beta2 as their coefficients in the regression.
In the (beta1, beta2) plane, the constraint B traces a diamond, the constraint C traces a circle, and the contours of the minimising term A are ellipses centred on the least squares solution.
Graph of cost function for Lasso Regression (Figure 1)
Graph of cost function for Ridge Regression (Figure 2)
As you can see, in Lasso the diamond has corners that lie on the axes, so the ellipse of the minimising term often first touches the constraint at one of those corners, which makes the corresponding coefficient exactly zero. In Ridge the circle has no corners, so the ellipse generally touches it at a point that is not on an axis; the coefficients shrink towards zero but are not exactly zero.
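The geometric argument can be checked numerically with a small sketch. Everything specific here is an assumption chosen for illustration: the elliptical loss A is centred at a hypothetical least squares solution (2.0, 0.3), the constraint budget is 1, and the minimum is found by a simple grid search along the boundary of each constraint region. The search is restricted to the boundary because the centre of the ellipse lies outside both regions.

```python
import math


def A(b1, b2):
    # Hypothetical elliptical least squares term centred at (2.0, 0.3).
    return (b1 - 2.0) ** 2 + 2.0 * (b2 - 0.3) ** 2


def boundary_points(shape, n=3600):
    """Yield n points on the diamond |b1| + |b2| = 1 or the circle b1^2 + b2^2 = 1."""
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        c, s = math.cos(theta), math.sin(theta)
        # Rescale the unit-circle point onto the diamond when requested.
        scale = abs(c) + abs(s) if shape == "diamond" else 1.0
        yield (c / scale, s / scale)


def constrained_min(shape):
    # Boundary point with the smallest value of A.
    return min(boundary_points(shape), key=lambda p: A(*p))


lasso_b = constrained_min("diamond")
ridge_b = constrained_min("circle")
print("Lasso (diamond):", lasso_b)
print("Ridge (circle): ", ridge_b)
```

With these assumed numbers the diamond search lands on the corner (1, 0), so beta2 is exactly zero, while the circle search returns a point with both coordinates nonzero. Moving the centre of the ellipse changes which corner, if any, is selected.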