
Imputing missing values through various strategies

Handling missing data is critical in practice, and thankfully there are many ways to impute it. In this recipe, we'll look at a few of these strategies. However, be aware that other approaches might fit your situation better.

scikit-learn comes with the ability to perform fairly common imputations: it applies a simple transformation to the existing data, such as a column mean or median, and uses the result to fill in the missing values (NAs). However, if there's a known reason why the data is missing (for example, response times for a server that times out after 100 ms), it might be better to take a statistical approach through other packages, such as a Bayesian treatment via PyMC, hazards models via Lifelines, or something home-grown.
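As a rough illustration of the common imputations mentioned above, here is a minimal sketch using `SimpleImputer` from `sklearn.impute`, which is the imputation class in recent scikit-learn releases (older releases used a different class, so adjust to your installed version). The toy array `X` and the `results` dictionary are made up for this example and are not part of the recipe.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data, invented for illustration: two features with missing
# entries encoded as NaN.
X = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [np.nan, 30.0],
    [4.0, 40.0],
])

# Each strategy computes a per-column statistic from the observed
# values and uses it to fill that column's missing entries.
results = {}
for strategy in ("mean", "median"):
    imputer = SimpleImputer(missing_values=np.nan, strategy=strategy)
    results[strategy] = imputer.fit_transform(X)

# A fixed placeholder can be used instead of a statistic.
constant_imputer = SimpleImputer(strategy="constant", fill_value=-1.0)
results["constant"] = constant_imputer.fit_transform(X)

for name, filled in results.items():
    print(name)
    print(filled)
```

Which strategy is appropriate depends on the data: the mean is sensitive to outliers, the median less so, and a constant makes the missingness visible to a downstream model. None of these accounts for why a value is missing, which is the case where the statistical approaches mentioned above may serve better.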