PSC 400: Data Analytics for Political Science

CLASS INFORMATION
Tuesday and Thursday, 11:00 - 12:20, Maxwell 402

INSTRUCTOR
Prof. Simon Weschle
Email: swweschl@syr.edu
Student Hours: Tuesday, 1:00 - 3:00, 332 Eggers or Zoom (see Syllabus for details)

COURSE DESCRIPTION
Data and data analysis are increasingly important not only for political science research, but also in public discourse and the workplace. In this class, you will learn how to conduct data analysis yourself. We'll cover topics such as finding data, data cleaning and data manipulation, data visualization, and data analysis. Along the way, we'll learn basic statistical functions and plots in the powerful (and free) statistical program R. Throughout, the class takes an applied approach, so students will develop their own research project and conduct their own data analyses.

TEXTBOOK
Elena Llaudet and Kosuke Imai: Data Analysis for Social Science, A Friendly and Practical Introduction. Princeton University Press.
I will refer to the book as DSS. You can obtain it via Amazon, Princeton University Press, or any other book retailer.

ASSIGNMENTS AND GRADING

  • Class Participation (15%): To succeed in this course, you have to attend class on a regular basis, come prepared by having worked through the assigned reading, and actively participate and ask questions.
  • Class Programming Review Exercises (10%): There will be short weekly review exercises that cover the basic R material we learned. Each exercise is graded as pass/fail, where a pass is worth 1 point and a fail worth 0.
  • Problem Sets (30%): There will be 5 to 6 problem sets in which you are asked to use what you have learned in class to analyze different kinds of data. The answers to these problem sets should be typed. They are graded on a scale from 1 to 5, and late submissions will be penalized by 1 point for every 24 hours past the due date. Any extension requests must be made to me personally and as soon as possible.
  • Data Analysis Memos (15%): Your main task in this class will be to write a paper with your own data analysis on a question that is of interest to you. To help you along the way, you will submit memos on the individual steps throughout the semester. The memos will cover: your research question and potential confounders, your data, data cleaning, descriptive statistics, bivariate relations, and (first) regression results. The memos should be short (2-3 pages) and typed in their entirety. They are graded on a scale from 1 to 5, and late submissions will be penalized by 1 point for every 24 hours past the due date. Any extension requests must be made to me personally and as soon as possible. I will provide feedback on every memo to help you improve your final paper.
  • Data Analysis Paper (30%): Your final paper should set out your research question, explain the data and statistical methods you use to investigate it, and describe what, based on your data analysis, the answer is. There is no minimum or maximum paper length. It should be as long as needed, but as short as possible. The papers are due on May 6.

SYLLABUS
For more detailed information on class policies and all of the fine print, please see the Syllabus.

CLASS TOPICS
Below is a list of topics that the class will cover. The exact week-to-week schedule will be developed and updated throughout the semester to reflect student interest and the pace at which we are progressing.

  • Getting Started with R
  • Causality with Randomized Experiments
  • Finding and Cleaning Data
  • Inferring Population Characteristics, Bivariate Relations
  • Predicting Outcomes, Causality with Observational Data, Linear Regression
  • Spatial Data, Network Data, Text as Data, Website Scraping (We will choose some of these topics based on student interest)
  • Data Analysis Paper Workshop

CLASS SCHEDULE
Below is a continuously updated class schedule. It contains information on what topics we are covering as well as on the readings and assignments. Please check this site EVERY WEEK.

Week 1: Getting Started with R

  • Tuesday (1/23): Try to install R and RStudio. Instructions can be found in DSS Ch. 1.5 (posted on Blackboard). Then, try to work through DSS Ch. 1.6 (posted on Blackboard).
  • Thursday (1/25): DSS 1.7-1.9 (posted on Blackboard).
  • Slides: Class 1, Class 2
  • Code: Class 1, Class 2
  • Data: turnout.csv, STAR.csv

Week 2: Estimating Causal Effects

Week 3: Inferring Population Characteristics

Week 4: Inferring Population Characteristics, Continued; Finding and Cleaning Data

Week 5: Finding and Cleaning Data, Continued

  • Tuesday (2/20): Revisit Weinberg, Harel, and Abramowitz, Ch. 4 (Blackboard)
  • Thursday (2/22): No class. Work on finding/cleaning data for your final project instead.
  • Slides: Class 9
  • Code: Class 9
  • Review Exercise 4 (due 3/1, submit on Blackboard).
  • Data Analysis Memo 2 (due 3/1, submit on Blackboard).

Week 6: Predicting Outcomes Using Linear Regression

Week 7: Estimating Causal Effects with Observational Data

Week 8: Quantifying Uncertainty

Week 9: Regression Extensions

Week 10: Extensions and Consolidation

Week 11: Spatial Data

Week 12: Text as Data, Webscraping

Week 13: Text as Data, Webscraping