Bias-variance Tradeoff

People often get confused by the bias-variance concept. So rather than going by the formal definitions of the bias-variance tradeoff, here I try to explain it in a general way.

The main goal of supervised learning is prediction.

Prediction error consists of irreducible error and reducible error.

Prediction error = Irreducible error + Reducible error.
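To make this decomposition concrete, here is a small simulation sketch (my own illustration, not from the post; the quadratic data, the linear model, and the noise level are assumed): it estimates each error component for a straight line fitted to quadratic data and checks that they add up.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5        # noise standard deviation; sigma**2 is the irreducible error
x_star = 1.5       # the point where we measure prediction error

def fit_and_predict():
    """Draw a fresh training set from y = x^2 + noise and fit a straight line."""
    x = rng.uniform(-2, 2, 30)
    y = x**2 + rng.normal(0, sigma, 30)
    slope, intercept = np.polyfit(x, y, 1)
    return slope * x_star + intercept

preds = np.array([fit_and_predict() for _ in range(5000)])
truth = x_star**2

bias_sq = (preds.mean() - truth) ** 2   # error from the linear restriction
variance = preds.var()                  # error from sampling the training set
# Average squared error against fresh noisy observations of the truth:
mse = np.mean((preds - (truth + rng.normal(0, sigma, preds.size))) ** 2)

print(f"bias^2={bias_sq:.3f}  variance={variance:.3f}  "
      f"irreducible={sigma**2:.3f}  total={mse:.3f}")
```

Up to simulation noise, the printed total matches bias^2 + variance + sigma^2, which is exactly the decomposition above.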

Irreducible error is due to noise in the data set; we cannot minimise it.

Reducible error comes from bias error and variance error.

We can minimise the reducible error to minimise the prediction error.

Bias Error:

Bias error is due to restrictions/assumptions we make while building a model.

Let’s take an example in which the training data is quadratic in nature, as shown in the figure. (Kindly ignore the X and Y labels; just assume that our training data set is spread out as shown.)


If we assume that the model is linear, then it does not fit the training data set exactly, and the error due to this restriction is known as bias error.
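A minimal sketch of this idea (the degrees and noise level are my own choices, not from the post): fit a straight line to quadratic data with `numpy.polyfit` and compare its training error against a quadratic fit.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-2, 2, 50)
y = x**2 + rng.normal(0, 0.3, x.size)  # quadratic training data with mild noise

# Restricted model: a straight line (degree-1 polynomial).
line = np.poly1d(np.polyfit(x, y, 1))
line_mse = np.mean((line(x) - y) ** 2)

# Relaxed model: allow a quadratic term.
quad = np.poly1d(np.polyfit(x, y, 2))
quad_mse = np.mean((quad(x) - y) ** 2)

print(f"linear train MSE={line_mse:.3f}, quadratic train MSE={quad_mse:.3f}")
```

The straight line's training error stays large no matter how much data we collect; that stubborn residual is the bias error.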

So, strong restrictions on the model lead to high bias error.

Variance Error:

Variance error is the error due to the sampling of the training data set: how much the fitted model changes when it is trained on a different sample.

Let us once again refer to the figure above. If we place few restrictions on the model and fit the quadratic data with a high-degree polynomial, then the bias is low but the variance is high: the fitted curve depends strongly on the particular training sample, so predictions on test data will vary widely.
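One way to see this (a simulation sketch with assumed numbers, not from the post) is to refit a very flexible polynomial on many resampled training sets and watch how much its prediction at a single point jumps around, compared with a linear fit:

```python
import numpy as np

rng = np.random.default_rng(2)
x_star = 1.8  # a point near the edge of the data, where wiggles are worst

def prediction(degree):
    """Fit a polynomial of the given degree to one resampled training set."""
    x = rng.uniform(-2, 2, 20)
    y = x**2 + rng.normal(0, 0.3, 20)
    return np.polyval(np.polyfit(x, y, degree), x_star)

rigid = np.array([prediction(1) for _ in range(300)])      # many restrictions
flexible = np.array([prediction(10) for _ in range(300)])  # few restrictions

print(f"variance of linear predictions:    {rigid.var():.3f}")
print(f"variance of degree-10 predictions: {flexible.var():.3f}")
```

The rigid model gives nearly the same (biased) answer on every resample; the flexible model's answer swings wildly from sample to sample, which is the variance error.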

Bias-variance tradeoff:

It is clear now that low bias, which means few restrictions on the model, gives you high variance; on the contrary, more restrictions on the model lead to high bias and low variance. This is called the bias-variance tradeoff.

We say a model with high variance is overfitting and a model with high bias is underfitting.

If we go with high variance, the model fits the training data almost perfectly but fails to fit the test data, which leads to overfitting: the model is almost memorising the training data set. On the other side, if we go with high bias, the model fits neither the training data nor the test data well, which leads to underfitting.
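The whole tradeoff can be seen in one small experiment (a sketch with assumed degrees and noise, not from the post) by comparing training and test error for an underfit, a reasonable, and an overfit polynomial on the same quadratic data:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample(n):
    """One sample of the quadratic data: y = x^2 + noise."""
    x = rng.uniform(-2, 2, n)
    return x, x**2 + rng.normal(0, 0.3, n)

x_train, y_train = sample(20)
x_test, y_test = sample(200)

results = {}
for degree in (1, 2, 15):
    model = np.poly1d(np.polyfit(x_train, y_train, degree))
    train_mse = np.mean((model(x_train) - y_train) ** 2)
    test_mse = np.mean((model(x_test) - y_test) ** 2)
    results[degree] = (train_mse, test_mse)
    print(f"degree {degree:2d}: train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

Typically the degree-1 model underfits (high error on both sets), the degree-15 model nearly memorises the training set (tiny training error, much larger test error), and the degree-2 model balances the two.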

Hope you got clarity on this concept.

Post your questions in the comment section if you have any.

Happy learning 🙂