Getting data from the web: scraping


Location: Room 104, Stuart Hall, Chicago, IL

Overview

  • Define HTML and CSS selectors
  • Introduce the rvest package
  • Demonstrate how to extract information from HTML pages
  • Demonstrate how to extract tables and convert to data frames
  • Practice scraping data
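
The workflow outlined above can be sketched in a few lines. This is a minimal illustration rather than the course's own example: it assumes rvest 1.0.0 or later (for html_element(), html_elements(), and html_text2()), and the HTML snippet, its ids and classes, and the film names and ratings are all made up so that the code runs without a network connection.

```r
library(rvest)

# A small inline HTML document (hypothetical content) so the example runs offline
page <- read_html('
<html>
  <body>
    <h1 id="title">Top films</h1>
    <p class="note">Ratings are illustrative only.</p>
    <a class="film" href="/film/1">Film A</a>
    <a class="film" href="/film/2">Film B</a>
    <table id="ratings">
      <tr><th>film</th><th>rating</th></tr>
      <tr><td>Film A</td><td>8.1</td></tr>
      <tr><td>Film B</td><td>7.4</td></tr>
    </table>
  </body>
</html>')

# CSS selectors: "#title" matches an id, "a.film" matches <a> tags with class "film"
title <- page %>% html_element("#title") %>% html_text2()

# html_elements() returns every match; pull out the link text and the href attribute
films <- page %>% html_elements("a.film")
film_names <- html_text2(films)
film_links <- html_attr(films, "href")

# html_table() converts an HTML table into a data frame (a tibble),
# using the <th> row as column names
ratings <- page %>% html_element("#ratings") %>% html_table()

print(title)
print(film_names)
print(film_links)
print(ratings)
```

To scrape a live page, replace the inline string with a URL passed to read_html(); the selectors then have to be worked out from that page's own markup, for example with the browser's inspector.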

Before class

Class materials

  • Web scraping
  • rvest
    • Load the library (library(rvest))
    • demo("tripadvisor") - scraping a Tripadvisor page
    • demo("united") - scraping a web page that requires a login
    • Scraping IMDb

What you need to do

Benjamin Soltoff
Assistant Instructional Professor in Computational Social Science