In this work, we study the use of Twitter by House, Senate and gubernatorial candidates during the midterm (2010) elections in the U.S. Our data includes almost 700 candidates and over 690k documents that they produced and cited in the 3.5 years leading to the elections. We utilize graph and text mining techniques to analyze differences between Democrats, Republicans and Tea Party candidates, and suggest a novel use of language modeling for estimating content cohesiveness. Our findings show significant differences in the usage patterns of social media, and suggest conservative candidates used this medium more effectively, conveying a coherent message and maintaining a dense graph of connections. Despite the lack of party leadership, we find Tea Party members display both structural and language-based cohesiveness. Finally, we investigate the relation between network structure, content and election results by creating a proof-of-concept model that predicts candidate victory with an accuracy of 88.0%.
The Party is Over Here: Structure and Content in the 2010 Election
1. The Party is Over Here: Structure and Content in the 2010 Election Avishay Livne, Matt Simmons, Eytan Adar, Lada Adamic University of Michigan ICWSM 2011 1
2. Twitter and Politics 14% of Internet users engaged in political activity via social media in 2008 => 22% in 2010 [Smith 2008, 2010]. Political Polarization on Twitter – [Conover et al. 2011] Characterizing Debate Performance via Aggregated Twitter Sentiment – [Diakopoulos& Shamma2010] 2 Seal of Approval
3. Twitter in Politics Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment – [Tumasjan et al. 2010] Hypothesis: Portion of tweets mentioning a party => portion of votes this party will get. Achieved MAE of 1.65%, traditional polls achieved 0.8%-1.48%. German federal elections 2009. In the same time a German octopus was used to predict the World Cup ‘10 results... … correctly predicting 8/8 matches Paul the Octopus 3 Our goal: Can we extract more information from Twitter?
16. Language Similarity vs. Graph Distance 16 With retweets Without retweets Not similar Similar The closer two candidates are the more similar their content is. This trend diminishes after 3 steps.
17. Latent Topics 17 Distribution of documents over topics Topics 1 LDA Documents Documents 2 Average over party’s documents Distribution of parties over topics Topics Which topics are more affiliated with each party Compare distributions Parties
23. Summary Twitter is prevalent in campaigns Tea party more aggressive and Twitter-savvy Republicans more aligned (network) Democrats less cohesive (content) Content similarity vs. network distance Network and language improve prediction 23
24. Future Work More extensive topic & sentiment analysis Temporal analysis Integrating funding Integrating constituents (the crowd) 24
25. Thanks Thank you! Thanks to Abe Gong for helpful insights. More information in the paper. Contact info: avishay@umich.edu 25