Hands-On Python Deep Learning for the Web

Bias and variance

Bias and variance are intrinsic to any ML model, and a good understanding of them helps greatly in the further assessment of a model. The trade-off between the two is commonly used by practitioners to assess the performance of machine learning systems.

You are encouraged to watch this lecture by Andrew Ng to learn more about this trade-off: https://www.youtube.com/watch?v=fDQkUN9yw44&t=293s.

Bias is the set of assumptions that an ML algorithm makes in order to learn the representations underlying the given data. When the bias is high, the corresponding algorithm is making strong assumptions about the data; when the bias is low, the algorithm makes as few assumptions as possible. An ML model is said to have low bias when it performs well on the training set. Some examples of low-bias ML algorithms are k-nearest neighbors and support vector machines, while algorithms such as logistic regression and naive Bayes are generally high-bias algorithms.
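To make this concrete, here is a small illustrative sketch (not from the lecture) using NumPy polynomial fits as stand-ins for high-bias and low-bias models. A straight line assumes a linear relationship, which a sine-shaped dataset cannot satisfy, so even the training error of the linear model stays large; the more flexible polynomial makes weaker assumptions and fits the training set closely:

```python
import numpy as np

rng = np.random.default_rng(0)

# Nonlinear ground truth: y = sin(2*pi*x), plus a little noise.
x = rng.uniform(0, 1, 50)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 50)

def train_mse(degree):
    """Fit a polynomial of the given degree and return its training MSE."""
    coeffs = np.polyfit(x, y, degree)
    preds = np.polyval(coeffs, x)
    return np.mean((preds - y) ** 2)

# A degree-1 (linear) model assumes a straight-line relationship: high bias.
# It cannot represent the sine curve, so even its training error stays large.
mse_linear = train_mse(1)

# A degree-9 model makes far weaker assumptions: low bias, low training error.
mse_flexible = train_mse(9)

print(f"linear (high bias)  train MSE: {mse_linear:.3f}")
print(f"degree-9 (low bias) train MSE: {mse_flexible:.3f}")
```

The polynomial degrees here are arbitrary choices for illustration; the point is only that the high-bias model underfits the training set no matter how much data it sees.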

Variance, in an ML context, concerns how sensitive a model is to the particular data it was trained on. A high-variance model captures not only the underlying patterns in the training data but also the noise specific to that sample, so its learned representation can change considerably from one training set to another. Low variance conveys just the opposite: the model's predictions stay stable across different training samples. Algorithms such as support vector machines are generally high on variance, while algorithms such as naive Bayes are low on variance.
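Variance can be measured directly in a sketch like the following (an illustrative setup, not from the book): refit two polynomial models, one rigid and one very flexible, on many independent training sets drawn from the same process, and compare how much their predictions move around on a fixed grid of inputs. That spread is the model's variance:

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_data(n=30):
    """Draw a fresh training set from the same underlying process."""
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)
    return x, y

def predictions(degree, grid):
    """Fit a polynomial on a fresh sample and predict on a fixed grid."""
    x, y = sample_data()
    return np.polyval(np.polyfit(x, y, degree), grid)

grid = np.linspace(0.05, 0.95, 50)

def spread(degree, trials=20):
    """Average per-point standard deviation of predictions across refits."""
    preds = np.stack([predictions(degree, grid) for _ in range(trials)])
    return preds.std(axis=0).mean()

low_var = spread(2)    # rigid model: predictions barely change across samples
high_var = spread(12)  # flexible model: predictions swing with each new sample

print(f"degree-2  prediction spread: {low_var:.3f}")
print(f"degree-12 prediction spread: {high_var:.3f}")
```

The degree-12 model's predictions vary far more across training sets than the degree-2 model's, which is exactly what "high variance" means in practice.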