Learn to scrape, parse, analyze, and visualize data for exploratory analysis and quantitative research. No previous programming knowledge is assumed.
This is a for-credit applied course focusing on a pragmatic understanding of programming languages and software libraries. It is oriented specifically towards students in the social sciences and humanities whose emerging research projects require basic programming skills.
Students in the course will learn to write programs in Python, an open-source interpreted programming language, to use databases, and to interact with a wide variety of existing software libraries. The course aims to demonstrate that data can be created, analyzed, and visualized by a diversity of methods, and to encourage students not to be intimidated by unfamiliar computer programming dialects and interfaces. The course will introduce methods required to parse text files, scrape data from other sources, write structured programs for statistical analysis, create and query databases, simulate social processes, visualize datasets, conduct network analysis, and assemble multiple processes into software "pipelines". In doing so, the course also aims to unshackle academic researchers from the constraints of commercial, general-purpose statistics/GIS software and to free them from the limitations of working with pre-existing and pre-formatted data sets.
Each week's lectures will be accompanied by a take-home programming assignment which will be due before the following week's class on Tuesday. Weekly tutoring hours will be provided for those requiring extra guidance on the assignment. The programming assignments will often be cumulative and build on one another, so completing the functionality of each assignment is crucial.
This year we will be using Piazza for class discussion. Rather than emailing questions to the teaching staff directly, we encourage you to post your questions there.
The course grade will be composed of three parts: participation (10%), homework (50%), and a final project (40%). Participation will be evaluated primarily on your activity on Piazza, where you can ask and answer questions about anything course-related. Homework will be given full credit if completed on time and will be discounted 10% for each class period that it remains incomplete. If you do not complete an assignment on time, you will need to ask us to look at it again for re-evaluation. If you are having a hard time, let us know on Piazza or in person at any of the study sessions. We hope that you will support each other in solving problems associated with the homework, but each student must understand, complete, and "turn in" their own work. Undergraduate final projects will involve creating a flexible, interactive, web-presentable project that performs one of the following: (1) network analysis on Facebook data; (2) content analysis on Twitter data; or (3) simulation analysis on city portal (e.g., crime) data. Graduate final projects will be flexible, interactive, web-presentable projects of the students' own design that solve a substantial research problem. (Undergraduates can petition to do a "graduate" project.)
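As a rough illustration of how the grade components combine, here is a minimal Python sketch. The weights come from the paragraph above. Everything else is an assumption made for illustration: the function names are hypothetical, scores are on a 0.0 to 1.0 scale, all assignments are weighted equally, and the late discount is read as 10 percentage points per class period with a floor of zero. The teaching staff's actual calculation may differ.

```python
# Weights stated in the syllabus: participation 10%, homework 50%, final project 40%.
WEIGHTS = {"participation": 0.10, "homework": 0.50, "final_project": 0.40}

# Assumption: the late discount is 10 percentage points per class period,
# and an assignment's credit never drops below zero.
LATE_PENALTY_PER_PERIOD = 0.10


def homework_credit(periods_late):
    """Return the fraction of credit (0.0 to 1.0) earned for one assignment."""
    return max(0.0, 1.0 - LATE_PENALTY_PER_PERIOD * periods_late)


def course_grade(participation, periods_late_per_assignment, final_project):
    """Combine component scores (each 0.0 to 1.0) into an overall grade.

    Assumption: every homework assignment counts equally.
    """
    credits = [homework_credit(p) for p in periods_late_per_assignment]
    homework = sum(credits) / len(credits)
    return (WEIGHTS["participation"] * participation
            + WEIGHTS["homework"] * homework
            + WEIGHTS["final_project"] * final_project)


if __name__ == "__main__":
    # Hypothetical student: full participation, one of four assignments
    # completed two class periods late, and 90% on the final project.
    print(course_grade(1.0, [0, 0, 2, 0], 0.9))
```

Under these assumptions, a single late assignment has a small effect on the overall grade, but because assignments are cumulative, falling behind on one can delay the next.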
|Programming, Python Fundamentals and an Introduction to Computational Social Science|
|Code Reuse and Tuning|
|Week 3 |
|Data and Information Visualization|
|High performance computing|
|Final (March 19th)||Hacker Fair|
1. Students will be required to install Python distributions and other software on their own computers during a Week 3 "installathon".