cm012 - May 2, 2018

Overview

• Define a decision tree
• Demonstrate how to estimate a decision tree
• Define and estimate a random forest
• Introduce the caret package for statistical learning in R
• Define resampling methods
• Compare and contrast the validation set approach with leave-one-out and $$k$$-fold cross-validation
• Demonstrate how to conduct cross-validation using modelr

Before class

• Read chapters , 8.1, 8.2.2, and 5.1 in An Introduction to Statistical Learning if you want a rigorous introduction to the mathematics behind logistic regression, decision trees, and random forests. In class we will briefly summarize how these methods work and spend the bulk of our time estimating and interpreting these models.