Saturday, June 16, 2018

Bias-Variance Tradeoff

Actual model:

$$y = f(x) + \epsilon$$

where $\epsilon$ is the irreducible error, with $E[\epsilon] = 0$ and $\mathrm{Var}(\epsilon) = \sigma^2$.

Estimation:

We approximate $f$ with an estimate $\hat{f}$ learned from the training data, so the prediction is $\hat{y} = \hat{f}(x)$.

Bias and Variance of the Estimation:

$$\mathrm{Bias}\big[\hat{f}(x)\big] = E\big[\hat{f}(x)\big] - f(x)$$

$$\mathrm{Var}\big[\hat{f}(x)\big] = E\Big[\big(\hat{f}(x) - E[\hat{f}(x)]\big)^2\Big]$$

The expectations are taken over different training sets, each of which yields a different fitted $\hat{f}$. Here, $E[f(x)] = f(x)$, since $f$ is deterministic.


MSE of estimation as a function of bias and variance of estimation:

$$E\big[(y - \hat{f}(x))^2\big] = \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2$$

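As a sanity check, this decomposition can be verified numerically. The sketch below is a minimal illustration, assuming a toy ground truth $f(x) = \sin(2\pi x)$ and degree-3 polynomial least-squares fits (all names and settings are made up for illustration): it repeatedly draws training sets, refits the estimator, and compares the empirical MSE at a test point with bias² + variance + σ².

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):                       # true deterministic function
    return np.sin(2 * np.pi * x)

sigma = 0.3                     # std of the irreducible error epsilon
n_train, n_trials, degree = 30, 2000, 3
x0 = 0.25                       # test point at which the decomposition is checked

preds, sq_errors = [], []
for _ in range(n_trials):
    # Fresh training set: y = f(x) + epsilon
    x = rng.uniform(0, 1, n_train)
    y = f(x) + rng.normal(0, sigma, n_train)
    # Fit \hat{f}: a degree-3 polynomial by least squares
    coefs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coefs, x0)
    preds.append(y_hat)
    # Squared error against a fresh noisy observation at x0
    y0 = f(x0) + rng.normal(0, sigma)
    sq_errors.append((y0 - y_hat) ** 2)

preds = np.array(preds)
bias2 = (preds.mean() - f(x0)) ** 2
var = preds.var()
print("empirical MSE         :", np.mean(sq_errors))
print("bias^2 + var + sigma^2:", bias2 + var + sigma ** 2)
```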
Example of bias-variance tradeoff

[Figure: two fits to the same noisy data points, a smooth red line (simple model) and a wiggly blue line (complex model).]
  • Overfitting is low bias, high variance
  • Underfitting is high bias, low variance

  • Red Line (simple fit): bias is high, variance is low (underfitting).
  • Blue Line (complex fit): bias is low, variance is high (overfitting); see the polynomial-fit sketch below.
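To reproduce this picture numerically, here is a minimal sketch assuming the same kind of toy data as above (a noisy sine curve; all settings are illustrative): a degree-1 and a degree-9 polynomial are fit to the same sample and evaluated on held-out points.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.sin(2 * np.pi * x)      # true function

# One noisy training sample and a clean test grid
x_train = rng.uniform(0, 1, 15)
y_train = f(x_train) + rng.normal(0, 0.3, 15)
x_test = np.linspace(0, 1, 200)

for degree in (1, 9):
    coefs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - f(x_test)) ** 2)
    # degree 1: train and test error both high -> high bias (underfit, the red line)
    # degree 9: train error far below test error -> high variance (overfit, the blue line)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```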


Properties: Model Complexity

  • The more complex the model, the lower the bias.
  • The more complex the model, the higher the variance.


Properties: Regularization

  • The lower the regularization strength $\lambda$, the lower the bias and the higher the variance.
  • The higher the regularization strength $\lambda$, the higher the bias and the lower the variance (see the Ridge sketch below).
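A rough way to see this effect, sketched here with scikit-learn's Ridge regression, where the `alpha` argument plays the role of $\lambda$ (the data and settings are made up for illustration): refit the same degree-9 polynomial model on many resampled training sets, then measure how far the average prediction is from the truth (bias) and how much individual predictions scatter around that average (variance).

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
f = lambda x: np.sin(2 * np.pi * x)
poly = PolynomialFeatures(degree=9)
x_test = np.linspace(0, 1, 50).reshape(-1, 1)
X_test = poly.fit_transform(x_test)

for alpha in (1e-6, 1e-2, 10.0):          # alpha is sklearn's name for lambda
    preds = []
    for _ in range(200):                   # many training sets -> spread = variance
        x = rng.uniform(0, 1, 30).reshape(-1, 1)
        y = f(x).ravel() + rng.normal(0, 0.3, 30)
        model = Ridge(alpha=alpha).fit(poly.fit_transform(x), y)
        preds.append(model.predict(X_test))
    preds = np.array(preds)                # shape (200, 50)
    bias2 = np.mean((preds.mean(axis=0) - f(x_test).ravel()) ** 2)
    var = np.mean(preds.var(axis=0))
    print(f"alpha={alpha:g}: bias^2={bias2:.3f}, variance={var:.3f}")
```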


Properties: Number of Samples

  • Increasing the sample size will decrease the variance.
  • If a learning algorithm is suffering from high bias (under-fit), getting more training data will not (by itself) help much.

  • If a learning algorithm is suffering from high variance, getting more training data is likely to help (see the sample-size sketch below).
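The same kind of simulation illustrates the sample-size effect, again under the toy sine-curve setup (illustrative only): hold the model fixed and grow the training set; the variance of the fitted predictions shrinks while the bias, which comes from the model itself, barely moves.

```python
import numpy as np

rng = np.random.default_rng(3)
f = lambda x: np.sin(2 * np.pi * x)
x_test = np.linspace(0, 1, 50)

for n in (20, 80, 320):                   # growing training-set size
    preds = []
    for _ in range(300):                  # refit on many independent samples
        x = rng.uniform(0, 1, n)
        y = f(x) + rng.normal(0, 0.3, n)
        coefs = np.polyfit(x, y, 3)       # model complexity held fixed (degree 3)
        preds.append(np.polyval(coefs, x_test))
    preds = np.array(preds)
    bias2 = np.mean((preds.mean(axis=0) - f(x_test)) ** 2)
    var = np.mean(preds.var(axis=0))
    # variance falls roughly like 1/n; the bias from using a degree-3 model stays put
    print(f"n={n}: bias^2={bias2:.4f}, variance={var:.4f}")
```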

Summary

  • Getting more training examples fixes high variance (overfit model, use more training examples).
  • Using a smaller set of features fixes high variance (overfit model, use fewer features).
  • Getting additional features fixes high bias (underfit model, get more features).
  • Adding polynomial features (i.e. a more complex model) fixes high bias (underfit model, increase model complexity).
  • Increasing $\lambda$ fixes high variance (overfit model, increase regularization).
  • Decreasing $\lambda$ fixes high bias (underfit model, decrease regularization).

So, for an overfit model (low bias, high variance):

  • Increase sample size
  • Reduce number of features
  • Increase regularization

For an underfit model (high bias, low variance):

  • Get more features
  • Increase model complexity
  • Decrease regularization
