Statistical learning

Jul 13, 2020 9:30 AM


  • Review the major goals of statistical learning
  • Explain the difference between parametric and non-parametric methods
  • Introduce linear models and ordinary least squares regression
  • Demonstrate how to estimate a linear model in R using lm()
  • Demonstrate how to extract model statistics using broom and modelr
  • Practice estimating and interpreting linear models
  • Demonstrate the use of logistic regression for classification
  • Identify methods for assessing classification model accuracy

Before class

  • Read chapters 22-25 in R for Data Science
  • This is not a math/stats class. In class we will briefly summarize how these methods work and spend most of our time estimating and interpreting the models. That said, you should have some understanding of the mathematical underpinnings of statistical learning methods before implementing them yourselves. See below for some recommended readings:
For those with little/no statistics training
  • Chapters 7-8 of OpenIntro Statistics - an open-source textbook written at the level of an introductory undergraduate statistics course
For those with prior statistics training
  • Chapters 2-3 and 4.1-4.3 in An Introduction to Statistical Learning - a book on statistical learning written at the advanced undergraduate/master’s level
  • Chapters 4-5 in Hands-On Machine Learning with R - a recent publication that approaches these methods from the perspective of machine learning rather than traditional statistical inference. It includes code examples using R and the caret package.

Class materials

Additional readings

What you need to do after class

Benjamin Soltoff
Assistant Instructional Professor in Computational Social Science