Are Twitter Users Equal in Predicting Elections
1. Are Twitter Users Equal in Predicting Elections?
A Study of User Groups in Predicting 2012 U.S. Republican Presidential Primaries
Lu Chen, Wenbo Wang, Amit Sheth. Are Twitter Users Equal in Predicting Elections? A Study of User Groups in
Predicting 2012 U.S. Republican Presidential Primaries. The 4th International Conference on Social Informatics
(SocInfo2012), 2012.
Lu Chen
chen@knoesis.org
Wenbo Wang
wenbo@knoesis.org
Amit Sheth
amit@knoesis.org
2. There is a surge of interest in building systems that harness the
power of social data to predict election results.
Are Twitter Users Equal in Predicting Elections? Lu Chen, Wenbo Wang, Amit Sheth 2
• Number of Facebook users talking about each candidate, and who is talking about which candidate (age, gender, state)
• Twitter users’ positive/negative opinions about each candidate
• Tweets from @BarackObama and @MittRomney organized by engagement on Twitter
• Number of Facebook “likes” and Twitter “followers”
• Real-time semantic analysis of topic, opinion, emotion, and popularity about each candidate
3. One problem seems to be ignored:
Are social media users equal in predicting elections?
They may be from different countries and states.
They may have different political beliefs.
They may be of different ages.
They may engage in the elections in different ways and with different levels of involvement.
……
They may be … different in predicting elections…?
WHOSE opinion really matters?
4. We study different groups of social media users who engage in the discussions of the 2012 U.S. Republican Presidential Primaries, and compare the predictive power among these user groups.
Data: Using the Twitter Streaming API, we collected tweets that contain the words “gingrich”, “romney”, “ron paul”, or “santorum” from 01/10/2012 to 03/05/2012 (Super Tuesday was 03/06/2012). The dataset comprises 6,008,062 tweets from 933,343 users.
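A minimal sketch of the selection rule described on this slide, applied offline to tweets that have already been collected. The keywords and date range come from the slide; the function name, the input layout, and the sample tweets are hypothetical, and the sketch does not call the Twitter Streaming API.

```python
from datetime import date

# Keywords and collection window as stated on the slide.
KEYWORDS = ("gingrich", "romney", "ron paul", "santorum")
START, END = date(2012, 1, 10), date(2012, 3, 5)


def matches_collection_rule(text, posted_on):
    """Return True if a tweet contains a tracked keyword and falls in the window."""
    lowered = text.lower()
    in_window = START <= posted_on <= END
    return in_window and any(keyword in lowered for keyword in KEYWORDS)


# Hypothetical sample tweets: (text, posting date).
sample = [
    ("Romney leads in Ohio", date(2012, 2, 1)),
    ("Great weather today", date(2012, 2, 1)),
    ("Go Ron Paul!", date(2012, 3, 6)),  # outside the collection window
]
kept = [text for text, day in sample if matches_collection_rule(text, day)]
```

Matching is done on lowercased substrings, which is an assumption about how the keyword filter behaved.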
6. More than half of the users posted only one tweet. Only 8% of the users posted more than 10 tweets.
A small group of users (0.23%) produced a large share of the tweets (23.73%). Is tweet volume a reliable predictor?
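The concentration figures above can be computed from per-tweet author ids along the following lines. This is only a sketch: the function name, the input layout, and the toy data are assumptions, not the study's code.

```python
from collections import Counter


def concentration_stats(tweet_authors, top_fraction):
    """Summarize how tweets are distributed over users.

    tweet_authors: one user id per tweet (hypothetical input layout).
    top_fraction: fraction of the most active users to inspect, e.g. 0.0023.
    """
    counts = Counter(tweet_authors)
    per_user = sorted(counts.values(), reverse=True)
    n_users, n_tweets = len(per_user), len(tweet_authors)
    top_n = max(1, round(n_users * top_fraction))
    return {
        "single_tweet_users": sum(1 for c in per_user if c == 1) / n_users,
        "over_10_tweet_users": sum(1 for c in per_user if c > 10) / n_users,
        "top_user_tweet_share": sum(per_user[:top_n]) / n_tweets,
    }


# Toy data: one heavy user (11 tweets) and nine users with one tweet each.
authors = ["u0"] * 11 + [f"u{i}" for i in range(1, 10)]
stats = concentration_stats(authors, top_fraction=0.1)
```

In the toy data, the single heavy user is 10% of the users but accounts for more than half of the tweets, which is the kind of skew the slide warns about.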
The usage of hashtags and URLs reflects the users' intent to draw people's attention to the topic they discuss. The more engaged users show this intent more strongly and are more involved in the election event.
7. The original tweet-dominant group accounts for the biggest proportion of users in every user engagement group.
A significant number of users (34.71% of all the users) belong to the retweet-dominant group, whose voting intent might be more difficult to detect.
[Figure axis: Engagement Degree]
According to how users prefer to generate their tweets, i.e., their tweet mode, we classified the users as original tweet-dominant, original tweet-prone, balanced, retweet-prone, and retweet-dominant.
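One way to picture the five tweet modes is to bucket users by the share of retweets among their tweets. The slide does not give the cut-off values, so the thresholds in this sketch are illustrative assumptions, as is the function name.

```python
def tweet_mode(n_original, n_retweets):
    """Label a user by the share of retweets among their tweets.

    The cut-off values below are illustrative assumptions, not the
    thresholds used in the study.
    """
    total = n_original + n_retweets
    if total == 0:
        raise ValueError("user has no tweets")
    retweet_share = n_retweets / total
    if retweet_share < 0.2:
        return "original tweet-dominant"
    if retweet_share < 0.4:
        return "original tweet-prone"
    if retweet_share <= 0.6:
        return "balanced"
    if retweet_share <= 0.8:
        return "retweet-prone"
    return "retweet-dominant"
```

For example, under these assumed thresholds a user with 7 original tweets and 3 retweets is labeled original tweet-prone.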
8. More engaged users tend to post a mixture of content, with a similar proportion of opinion and information, or a larger proportion of information.
[Figure axis: Engagement Degree]
We use target-specific sentiment analysis techniques to classify each tweet as
positive or negative – whether the expressed opinion about a specific candidate is
positive or negative. The users are categorized based on whether they post more
information or more opinion.
9. Right-leaning users were (as expected) more involved in the Republican primaries in several ways: more users, more tweets, more original tweets, and higher usage of hashtags and URLs.
We collected a set of Twitter users with known political preference from Twellow (http://www.twellow.com/categories/politics). Based on the assumption that a user tends to follow others who share the same political preference, we identified the left-leaning and right-leaning users from their following/follower relations. We tested this method on a dataset of 3,341 users, and it showed an accuracy of 0.9243.
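The follow-based assumption above can be sketched as a simple majority count over labeled seed accounts. This is a simplification under stated assumptions: it uses only the accounts a user follows (the study used following/follower relations), and the function name, the tie handling, and the seed account names are hypothetical.

```python
def infer_leaning(following, left_seeds, right_seeds):
    """Guess a user's leaning from the labeled accounts they follow.

    following, left_seeds, right_seeds: sets of account ids.
    Returns "left", "right", or None when the evidence is absent or tied.
    """
    n_left = len(following & left_seeds)
    n_right = len(following & right_seeds)
    if n_left == n_right:
        return None
    return "left" if n_left > n_right else "right"


# Hypothetical seed accounts with known political preference.
left_seeds = {"left_a", "left_b"}
right_seeds = {"right_a", "right_b", "right_c"}
```

A user following `right_a`, `right_b`, and `left_a` would be labeled right-leaning; a user following none of the seeds stays unlabeled.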
10. Pearson's r between the number of users per state and the state population is 0.9459, and between the number of tweets per state and the state population it is 0.9667 (p < .0001).
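For reference, Pearson's r can be computed as in the following sketch. The numbers are made up for illustration and are not the study's per-state data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up example: state populations (millions) and Twitter user counts.
population = [1.0, 2.0, 3.0, 4.0]
users = [2000, 4000, 6000, 8000]
r = pearson_r(population, users)
```

Because the made-up user counts are exactly proportional to population, r is 1 here; real data would give a value below 1.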
We utilized the background knowledge from LinkedGeoData to identify the
states from user location information.
If the user's state could not be inferred from his/her location in the profile, we
utilized the geographic locations of his/her tweets. A user was recognized as from
a state if his/her tweets were from that state.
11. Predicting a User's Vote
• Basic idea: for which candidate the user shows the most support
– Frequent mentions
– Positive sentiment
[Formula: the user's score for candidate c, with one case when the user posted an opinion about c and another when the user mentioned c but did not post an opinion about c]
Nm(c): the number of tweets mentioning candidate c
Npos(c): the number of positive tweets about candidate c
Nneg(c): the number of negative tweets about candidate c
Smoothing parameter: a value strictly between 0 and 1
Discounting parameter: a value strictly between 0 and 1 that discounts the score when the user does not express any opinion towards c
More mentions: higher score
More positive/fewer negative opinions: higher score
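The exact scoring formula is not recoverable from the slide text, so the sketch below is a hypothetical stand-in that only respects the stated properties: more mentions raise the score, more positive and fewer negative opinions raise the score, and a discount applies when the user mentions a candidate without expressing an opinion. The function names, the formula itself, and the default parameter values are assumptions.

```python
def support_score(n_mentions, n_pos, n_neg, n_total, smoothing=0.5, discount=0.5):
    """Hypothetical support score of one user for one candidate.

    n_mentions, n_pos, n_neg correspond to Nm(c), Npos(c), Nneg(c);
    n_total is the user's total number of tweets in the time window.
    This is NOT the formula from the study, only an illustration.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    mention_share = n_mentions / n_total
    if n_pos + n_neg > 0:
        # Smoothed share of positive opinions, always strictly between 0 and 1.
        positivity = (n_pos + smoothing) / (n_pos + n_neg + 2 * smoothing)
        return mention_share * positivity
    # The user mentioned the candidate but expressed no opinion.
    return discount * mention_share


def predict_vote(user_counts, n_total):
    """user_counts maps candidate -> (n_mentions, n_pos, n_neg)."""
    scores = {c: support_score(*counts, n_total) for c, counts in user_counts.items()}
    return max(scores, key=scores.get), scores


# Made-up counts for a user with 10 tweets in the window.
counts = {"romney": (6, 3, 1), "santorum": (4, 0, 0)}
predicted, scores = predict_vote(counts, n_total=10)
```

The predicted vote is simply the candidate with the highest score, matching the basic idea stated on the slide.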
12. Prediction Results
We examine the predictive power of different user groups in predicting the
results of Super Tuesday races in 10 states.
To predict the election results in a state, we used only the collection of
users who are identified from that state.
The results were evaluated in two ways: (1) the accuracy of predicting
winners, and (2) the error rate between the predicted percentage of votes
and the actual percentage of votes for each candidate.
We examined four time windows -- 7 days, 14 days, 28 days and 56 days
prior to the election day. In a specific time window, a user's vote was
assessed using only the set of tweets he/she created during this time.
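The two evaluation measures can be sketched as follows for a single state. The slide does not define the error rate precisely, so mean absolute error over candidates is an assumption here, and all names and numbers are made up for illustration.

```python
from collections import Counter


def vote_shares(predicted_votes):
    """Turn a list of per-user predicted votes into vote shares."""
    counts = Counter(predicted_votes)
    total = sum(counts.values())
    return {candidate: n / total for candidate, n in counts.items()}


def winner(shares):
    """Candidate with the largest share."""
    return max(shares, key=shares.get)


def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual shares.

    This error definition is an assumption, not taken from the study.
    """
    gaps = (abs(predicted.get(c, 0.0) - share) for c, share in actual.items())
    return sum(gaps) / len(actual)


# Hypothetical state: four users' predicted votes and made-up actual results.
predicted = vote_shares(["romney", "romney", "santorum", "paul"])
actual = {"romney": 0.45, "santorum": 0.35, "paul": 0.10, "gingrich": 0.10}
winner_correct = winner(predicted) == winner(actual)
error = mean_absolute_error(predicted, actual)
```

Repeating this per state and per time window (using only the tweets created inside the window) gives the winner accuracy and average error that the user groups are compared on.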
13. The prediction accuracy:
Engagement Degree: High > Low or Very Low
Tweet Mode: Original Tweet-Prone > Retweet-Prone
Content Type: In a draw
Political Preference: Right-Leaning >> Left-Leaning
14. Revealing the challenge of identifying the vote intent of the “silent majority”.
Retweets may not necessarily reflect users' attitudes.
Predicting a user’s vote from more opinion tweets is not necessarily more accurate than predicting it from more information tweets.
The right-leaning user group provides the most accurate prediction result. In the best case (56-day time window), it correctly predicts the winners in 8 out of 10 states with an average prediction error of 0.1.
To some extent, this demonstrates the importance of identifying likely voters in electoral prediction.
15. Our findings
Twitter users are not “equal” in predicting elections!
The likely voters’ opinions matter more.
Some users’ opinions are more difficult to identify because of their lower levels of engagement or the implicit ways in which they express opinions.
16. More work needs to be done…
• Identifying likely/actual voters
• Improving sentiment analysis techniques
• Investigating possible data biases (e.g., spam tweets and political campaign tweets) and how they might affect the results
and more …
17. It is actually about tracking public opinion.
Polling or Social Media Analysis?
1. Sample size
2. Representative of the target population
3. Accurate measure of opinions
4. Timeliness
18. 1 Sample Size
Polling: Thousands of people
Social Media Analysis: Millions of people
19. 2 Representative of the Target Population
Polling:
• About 95% of US homes can be reached by landline telephone and cell phone.
• Sampling the target population randomly.
• Weighting the sample to census estimates for demographic characteristics (gender, race, age, educational attainment, and region).
Social Media Analysis:
• About 60% of American adults use social networking sites.
• Difficult to do random sampling.
• Limited demographic data (although with some work, this can be improved).
[1] Can Social Media Be Used for Political Polling? http://www.radian6.com/blog/2012/07/can-social-media-be-used-for-political-polling/
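The census weighting mentioned for polling can be illustrated with a minimal post-stratification sketch: each demographic group's support rate is weighted by that group's known share of the target population. The function name, group labels, and numbers are hypothetical, and the sketch assumes every group in the sample has a known population share.

```python
def poststratified_share(sample, population_shares):
    """Estimate candidate support after reweighting groups to population shares.

    sample: list of (group, supports_candidate) pairs.
    population_shares: group -> share of the target population.
    """
    totals, supporters = {}, {}
    for group, supports in sample:
        totals[group] = totals.get(group, 0) + 1
        supporters[group] = supporters.get(group, 0) + int(supports)
    return sum(
        population_shares[group] * supporters[group] / totals[group]
        for group in totals
    )


# Made-up sample that over-represents the "young" group.
sample = [("young", True), ("young", True), ("young", False), ("old", False)]
raw_share = sum(supports for _, supports in sample) / len(sample)
weighted_share = poststratified_share(sample, {"young": 0.5, "old": 0.5})
```

In this toy sample the raw support is one half, while the weighted estimate drops to one third because the over-represented group no longer dominates.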
20. 3 Accurate Measure of Opinions
Polling: Ask people what they think (“Who will you vote for?”)
Social Media Analysis: Look at what people talk about and extract their opinions. Not as accurate as polling.
21. 4 Timeliness
Polling: Not able to track people’s opinions in real time
Social Media Analysis: What is happening now
22. Social Media Analysis – Promising but Very Challenging
Promising:
• Increasing number of social media users
• Convenient and comfortable way to express opinions
• The analysis can be done in real time
• Lower cost
• A great complement (if not substitute) for polling
Challenging:
• Extracting demographic information
• Identifying the target population whose opinions matter, e.g., the likely voters in electoral prediction
• Discriminating personal opinion from the voice of mainstream media and political campaigns
• More accurate sentiment analysis/opinion mining, especially the identification of opinions about a specific object
23. Subjective Information Extraction, Lu Chen 23
Our Twitris+ system kept tracking people’s opinions on the 2012 U.S. Presidential Election in real time, and this is what we saw on Election Day …
26. Sentiment change about Barack Obama
Sentiment change about Mitt Romney
Positive/negative topics that contribute to such change
Analysis can be performed at the location or issue level
A key innovation in sentiment analysis, employed in Twitris+, is topic-specific sentiment analysis, which associates sentiment with an entity. The same sentiment phrase may be assigned different polarities when associated with different entities.
Twitris+ tracks sentiment trends about different entities and identifies topics/events that contribute to sentiment changes. The results are updated every hour.
27. Twitris+ Insights in 2012 Presidential Debates
How was Obama doing in the first debate?
28. How was Obama doing in the second debate?
Red Color: Negative Topics
Green Color: Positive Topics
29. Obama vs. Romney in the third debate
[Figure panels: Obama, Romney]
30. Thank you!
More about this study:
http://wiki.knoesis.org/index.php/ElectionPrediction
Kno.e.sis Center:
http://knoesis.wright.edu/
Twitris+:
http://twitris.knoesis.org/
Semantics driven Analysis of Social Media:
http://knoesis.org/research/semweb/projects/socialmedia
Editor's notes
Tweet volume alone may not be a reliable predictor, since a small group of users can produce a large share of the tweets, e.g., political campaign and promotional tweets.
Some of the Twellow preferences are self-declared.
There is a very strong correlation between the number of Twitter users/tweets from each state and the population of each state. A Pearson's correlation coefficient between 0.9 and 1.0 usually indicates a very strong correlation.
Categorized by engagement degree: the high-engagement users achieved better prediction results. This may be due to two reasons: (1) high-engagement users posted more tweets, and predictions based on more tweets are more reliable; (2) more engaged users were more involved in the election event and were more likely to vote.
Categorized by tweet mode: the original tweet-prone users achieved better prediction results. This might suggest the difficulty of identifying users' voting intent from retweets.
Categorized by content type: no significant difference was found between the two groups.
Categorized by political preference: the right-leaning user group achieved significantly better results than the left-leaning group.