Actual model:
$y = f(x) + \epsilon$,
where $\epsilon$ is the irreducible error (noise with $E[\epsilon] = 0$ and $\mathrm{Var}(\epsilon) = \sigma^2$).
Estimation:
We approximate $f(x)$ with $\hat{f}(x)$, estimated from the training data.
Bias and Variance of the Estimation:
$\mathrm{Bias}[\hat{f}(x)] = E[\hat{f}(x)] - f(x)$
$\mathrm{Var}[\hat{f}(x)] = E\big[(\hat{f}(x) - E[\hat{f}(x)])^2\big]$
Here, $E[f(x)] = f(x)$, since $f$ is deterministic.
MSE of estimation as a function of bias and variance of estimation:
$E\big[(y - \hat{f}(x))^2\big] = \mathrm{Bias}[\hat{f}(x)]^2 + \mathrm{Var}[\hat{f}(x)] + \sigma^2$
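For completeness, here is a short derivation sketch of this decomposition (added here, not part of the original post), using the standard assumptions that $\epsilon$ has zero mean, variance $\sigma^2$, and is independent of $\hat{f}(x)$:
$$
\begin{aligned}
E\big[(y - \hat{f}(x))^2\big]
  &= E\big[(f(x) + \epsilon - \hat{f}(x))^2\big] \\
  &= E\big[(f(x) - \hat{f}(x))^2\big]
     + 2\,E[\epsilon]\,E\big[f(x) - \hat{f}(x)\big] + E[\epsilon^2] \\
  &= E\big[(f(x) - \hat{f}(x))^2\big] + \sigma^2 \\
  &= \big(f(x) - E[\hat{f}(x)]\big)^2
     + E\big[\big(\hat{f}(x) - E[\hat{f}(x)]\big)^2\big] + \sigma^2 \\
  &= \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2
\end{aligned}
$$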
Example of bias-variance tradeoff
- Overfitting is low bias, high variance
- Underfitting is high bias, low variance
- Red line (in the figure): a simple model that underfits; bias is high, variance is low.
- Blue line (in the figure): a flexible model that overfits; bias is low, variance is high.
Properties: Model Complexity
- The more complex the model, the lower the bias.
- The more complex the model, the higher the variance (see the simulation sketch after this list).
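To make the complexity tradeoff concrete, here is a minimal simulation sketch (not from the original post): it repeatedly refits polynomials of a few degrees on fresh noisy samples of an assumed true function $\sin(2\pi x)$ and reports the estimated squared bias and variance. The noise level, sample size, and degrees are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # "True" underlying function (an assumption for this demo).
    return np.sin(2 * np.pi * x)

x_grid = np.linspace(0, 1, 50)        # points where bias/variance are evaluated
n_train, n_trials, sigma = 20, 200, 0.3

for degree in (1, 3, 9):
    preds = np.empty((n_trials, x_grid.size))
    for t in range(n_trials):
        # Fresh training set each trial: y = f(x) + noise.
        x = rng.uniform(0, 1, n_train)
        y = f(x) + rng.normal(0, sigma, n_train)
        coefs = np.polyfit(x, y, degree)        # fit a degree-d polynomial
        preds[t] = np.polyval(coefs, x_grid)    # predictions on the fixed grid
    bias2 = np.mean((preds.mean(axis=0) - f(x_grid)) ** 2)  # squared bias
    var = np.mean(preds.var(axis=0))                          # variance
    print(f"degree={degree}: bias^2={bias2:.3f}, variance={var:.3f}")
```

With these settings, higher degrees should show lower squared bias and higher variance, matching the two bullets above.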
Properties: Regularization
- The lower the regularization parameter $\lambda$, the lower the bias and the higher the variance.
- The higher the $\lambda$, the higher the bias and the lower the variance.
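The same kind of simulation can be pointed at the regularization knob instead of the degree. The sketch below (again illustrative, not from the post) keeps a fixed degree-9 polynomial and sweeps the regularization strength, which in scikit-learn's Ridge is called alpha and plays the role of $\lambda$ here.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
f = lambda x: np.sin(2 * np.pi * x)            # assumed true function
x_grid = np.linspace(0, 1, 50).reshape(-1, 1)
n_train, n_trials, sigma = 20, 200, 0.3

for alpha in (1e-6, 1e-2, 10.0):               # lambda: small -> low bias, high variance
    preds = np.empty((n_trials, len(x_grid)))
    for t in range(n_trials):
        X = rng.uniform(0, 1, (n_train, 1))
        y = f(X).ravel() + rng.normal(0, sigma, n_train)
        model = make_pipeline(PolynomialFeatures(degree=9), Ridge(alpha=alpha))
        model.fit(X, y)
        preds[t] = model.predict(x_grid)
    bias2 = np.mean((preds.mean(axis=0) - f(x_grid).ravel()) ** 2)
    var = np.mean(preds.var(axis=0))
    print(f"alpha={alpha:g}: bias^2={bias2:.3f}, variance={var:.3f}")
```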
Properties: Number of Samples
- Increasing the sample size decreases variance.
- If a learning algorithm is suffering from high bias (under-fit), getting more training data will not (by itself) help much.
- If a learning algorithm is suffering from high variance, getting more training data is likely to help.
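One way to see this in practice is a learning curve: training and validation error as a function of training-set size. The sketch below (illustrative, using scikit-learn's learning_curve on made-up data) fits a deliberately flexible model; as the sample size grows, the gap between training and validation error, the signature of high variance, should narrow.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (200, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.3, 200)

# A flexible (high-variance) model: degree-9 polynomial, almost no regularization.
model = make_pipeline(PolynomialFeatures(degree=9), Ridge(alpha=1e-6))

sizes, train_scores, val_scores = learning_curve(
    model, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5,
    scoring="neg_mean_squared_error")

for n, tr, va in zip(sizes, -train_scores.mean(axis=1), -val_scores.mean(axis=1)):
    # As n grows, the train/validation gap (a variance symptom) should shrink;
    # for a high-bias model both errors would instead plateau at a high value.
    print(f"n={n:3d}  train MSE={tr:.3f}  val MSE={va:.3f}")
```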
Summary
- Getting more training examples fixes high variance (overfit model: use more training examples)
- Using a smaller set of features fixes high variance (overfit model: use fewer features)
- Getting additional features fixes high bias (underfit model: get more features)
- Adding polynomial features (i.e. a more complex model) fixes high bias (underfit model: increase model complexity)
- Increasing $\lambda$ fixes high variance (overfit model: increase regularization)
- Decreasing $\lambda$ fixes high bias (underfit model: decrease regularization)
So, for an overfit model (low bias, high variance):
- Increase sample size
- Reduce number of features
- Increase regularization
For an underfit model (high bias, low variance):
- Get more features
- Increase model complexity
- Decrease regularization