R Programming By Example

Predicting Votes with Linear Models

This chapter shows how to work with statistical models in R: how to check data assumptions, specify linear models, make predictions, and measure predictive accuracy. It also shows how to find good models programmatically rather than by hand, which can save a lot of time. By the end of this chapter, we will have worked with various quantitative tools that are widely used in business and research. The packages used in this chapter are the same ones used in the previous chapter.

Just like in the previous chapter, the focus here is on automating the analysis programmatically rather than on deeply understanding the statistical techniques involved. Furthermore, since Chapter 2, Understanding Votes With Descriptive Statistics, showed how to work efficiently with functions, we will use that approach directly in this chapter: whenever possible, we'll write functions that automate our analysis. We will cover the following:

  • Splitting data into training and testing sets
  • Creating linear regression models used for prediction
  • Checking model assumptions with various techniques
  • Measuring predictive accuracy for numerical and categorical data
  • Programmatically finding the best possible model
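
As a rough preview of how the first few of these steps fit together, here is a minimal sketch in base R. It is only an illustration under stated assumptions: it uses the built-in mtcars dataset rather than this chapter's votes data, and the 70/30 split, the variable names (data_train, data_test, fit), and the choice of mean squared error as the accuracy measure are ours, not necessarily the ones used later in the chapter.

```r
# Illustrative only: built-in mtcars data stands in for the chapter's votes data.
set.seed(12345)  # make the random split reproducible

data <- mtcars
n <- nrow(data)

# Randomly pick 70% of the row indices for training (assumed proportion)
train_idx <- sample(n, size = floor(0.7 * n))
data_train <- data[train_idx, ]
data_test <- data[-train_idx, ]

# Specify and fit a linear model using the training set only
fit <- lm(mpg ~ wt + hp, data = data_train)

# Predict on the held-out rows and measure accuracy with mean squared error
predictions <- predict(fit, newdata = data_test)
mse <- mean((data_test$mpg - predictions)^2)
print(mse)
```

Keeping the test rows out of the call to lm() is what makes the resulting error a fair estimate of predictive accuracy on data the model has not seen.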