Wednesday, November 25, 2015

A Facebook Voting Model for Predicting Elections...

The race has been on for the past several years for scholars to figure out how best to use social media metrics to help predict electoral outcomes.  Matthew MacWilliams is the most recent entrant into the fray.  He recently published an article in PS: Political Science & Politics in which he proposes a new model for forecasting elections that incorporates Facebook data: specifically, the number of "Likes" a candidate's Facebook page receives, as well as Facebook's "People Talking About This" (PTAT) statistic.

For those of you who may be skeptical of a purely Internet effect on voting behavior, these were not the only variables used.  Rather, MacWilliams combined the Facebook participation variable with well-established metrics that track partisanship and incumbency advantage.

The results:  Tested against 15 of the most competitive Senate races in 2012, the Facebook model predicted the Senate winner more accurately than the fundamentals model (which does not include Facebook data) in 6 of the 8 weeks leading up to Election Day.  It also proved more accurate than "poll-of-polls" forecasts in 5 of those 8 weeks.

So this new Facebook model seems great, right?  Well, while the research is solid and the experiment thoughtfully constructed, a few questions are dying to be answered.

For starters, why is Facebook the sole focus of the study, and why only those two metrics?  I understand that Facebook data is more readily available and accessible than other big-data sources, including Twitter, but that alone does not make Facebook the ideal choice for representing what's happening across all of social media.  Each site has its own characteristics, and relying on any single one can skew results.  Granted, something had to be chosen, but a cross-section of several social media sites might be more instructive.

Second, when compared to the poll-of-polls averages, the Facebook model was better at forecasting in 5 of the 8 weeks.  Yes, that's more accurate, but only barely: 5-3 is not exactly a resounding confirmation.  Also, as MacWilliams himself points out, the Facebook model was better at forecasting outcomes the farther the prediction was from Election Day.  In other words, the closer to Election Day, the less accurate it was.  WHY?  That's an extremely interesting footnote, and surprisingly counter-intuitive, considering that one would expect the online metric to reflect people's sentiments more accurately in real time.
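As a rough back-of-the-envelope check on how weak a 5-3 split is, here is a minimal sketch (mine, not something from MacWilliams's article).  It treats each of the 8 weeks as an independent coin flip between two equally good models.  That independence is an assumption, and almost certainly too generous, since week-to-week forecasts are correlated.  The function name `prob_at_least` is purely illustrative.

```python
from math import comb


def prob_at_least(wins: int, trials: int, p: float = 0.5) -> float:
    """Chance of at least `wins` successes in `trials` independent trials.

    Assumption: every trial (here, every week) is independent and each
    model has the same probability `p` of being the more accurate one.
    """
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(wins, trials + 1)
    )


# Facebook model vs. poll-of-polls: better in 5 of 8 weeks
p_vs_polls = prob_at_least(5, 8)

# Facebook model vs. fundamentals model: better in 6 of 8 weeks
p_vs_fundamentals = prob_at_least(6, 8)

print(f"5 or more of 8 by chance: {p_vs_polls:.3f}")         # 0.363
print(f"6 or more of 8 by chance: {p_vs_fundamentals:.3f}")  # 0.145
```

Under that coin-flip assumption, a 5-3 edge or better turns up by chance about 36% of the time, and even a 6-2 edge about 14% of the time, so neither weekly tally says much on its own.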

Finally, the article closes by mentioning its "grand prize" goal:  to be useful in predicting presidential election outcomes.  To me, however, it may be more interesting to see how the Facebook model holds up when applied to state and local elections.  Because voter turnout and engagement are generally lower in local elections, I could almost make the argument that measuring Facebook "Likes" and the number of "People Talking About This" could either confirm the model's accuracy or completely invalidate it.  On the local level, it seems like it could go either way, which is why it would be great to see an experiment done on that front.

Overall, this research study was well done in a subfield where more experiments desperately need to be conducted.  For those readers out there not completely sold on the two Facebook metrics used, what alternative social media metrics do you think might be better?