The Python Elections Library and Other Programming Tools for Election Analysis...
During this campaign season, many political scientists are conducting research that uses online datasets as its primary source of data. For instance, in this blog's previous post, real-time Twitter data was collected and analyzed to get a sense of popular sentiment about Chris Christie's endorsement of Donald Trump. Scraping data from social media sites is an increasingly common focus of statistical research.
To this end, numerous programming tools are available to political scientists. Let me start by saying that, personally, my language of choice for tasks of this sort has long been Java, although this year I've migrated over to Python. Web APIs for Twitter, Facebook, Google, Reddit, and other platforms have been around for years, giving researchers easy access to their data. And for the 2016 election cycle there are a few notable new additions to our collective toolkit that, combined with some already-existing tools, really expand a researcher's capabilities. Here's what I'm using...
- Tweepy - still the best Python-based library for accessing the Twitter API.
- The Python Elections Library - a paid library that provides access to federal, state, and local election results, as well as delegate estimates for the presidential nomination contests.
- Elex - the Associated Press's brand-new command-line tool for accessing current election data.
- The Watson Emotion Analysis API - IBM's brand-new API, just released in beta. Whereas IBM's AlchemyAPI classifies language sentiment as positive, negative, or neutral, the Emotion Analysis API detects joy, fear, anger, sadness, and disgust, and rates each by magnitude.
- Matplotlib - the standard Python library for visualizing data with charts and graphs (although I'm currently searching for a better visualization tool, if you have recommendations).
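To give a sense of how the analysis side of such a pipeline might look, here is a minimal sketch that aggregates per-tweet emotion scores into an overall picture. The response shape (a flat dict of emotion names to scores) is an assumption for illustration, not the actual Watson API schema; in practice you would build these dicts from the API's JSON responses.

```python
from collections import defaultdict

def aggregate_emotions(responses):
    """Average each emotion's score across a batch of per-tweet results.

    `responses` is a list of dicts mapping emotion names to scores,
    e.g. {"joy": 0.8, "anger": 0.1}. This flat shape is a simplifying
    assumption, not the real API's response format.
    """
    totals = defaultdict(float)
    for scores in responses:
        for emotion, score in scores.items():
            totals[emotion] += score
    n = len(responses)
    return {emotion: total / n for emotion, total in totals.items()}

def dominant_emotion(averages):
    """Return the emotion with the highest average score."""
    return max(averages, key=averages.get)

if __name__ == "__main__":
    # Hypothetical scores for three tweets about a candidate.
    batch = [
        {"joy": 0.7, "anger": 0.1, "sadness": 0.2},
        {"joy": 0.4, "anger": 0.5, "sadness": 0.1},
        {"joy": 0.6, "anger": 0.2, "sadness": 0.3},
    ]
    averages = aggregate_emotions(batch)
    print(dominant_emotion(averages))
```

From here, the averaged scores plug straight into a Matplotlib bar chart for a quick visual comparison of emotions across candidates or time windows.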
These tools are a great starting point for collecting and analyzing social media data. Now if only you didn't have to pay a fee for some of these large datasets, or for sentiment analyses of them, then we'd really be cooking.