1. An overview of Twitter analytics
Wasim Ahmed (wahmed1@sheffield.ac.uk)
(Twitter: @was3210)
Acknowledgements to Sergej Lugovic (@sergejlugovic)
Contemporary Issues in Economy & Technology (CIET).
15th June 2016. Split, Croatia.
2. About me
19/06/2016 © The University of Sheffield
• Second Year PhD student from the Information
School, University of Sheffield (UK).
• PhD examines content that is shared on Twitter
during infectious disease outbreaks.
• Run a social media research blog (over 11
thousand hits)
3. About me
• Currently working on a PhD project
examining infectious disease outbreaks on
Twitter
• Alongside my PhD, I have assisted security research
teams, government, media, and
educational organisations globally
4. About me…continued
• Also work part time as a Research
Associate: Social Media specialist
5. Overview of workshop
• Part 1 – Overview of Twitter and case
study examples
• Part 2 – Overview of Twitter analytics
software / interactive sessions
• Part 3 – Q&A on tools – make sure to jot
down some questions!
6. Aims
• Better understand Twitter as a platform
• Provide examples of case studies using
social media analytics
• Gain knowledge and awareness of Twitter
analytics
7. Twitter
• Twitter allows brief text updates of up to 140
characters, known as ‘tweets’, to be shared
with other users
• Tweets can contain thoughts, feelings,
activities, and opinions (Chew and
Eysenbach, 2010).
8. Twitter
• Twitter reports having 316 million monthly
active users
• 500 million tweets are sent per day
• 80% of active Twitter users use a mobile
device (About Twitter, n.d.).
9. Why Twitter (data)?
• See my LSE Impact blog post (baseline comparison: Facebook)
• Twitter is a popular platform in terms of the media attention it receives and it
therefore attracts more research due to its cultural status
• Twitter makes it easier to find and follow conversations (i.e., by both its search
feature and by tweets appearing in Google search results)
• Twitter has hashtag norms which make it easier to gather, sort, and expand
searches when collecting data
• Twitter data is easy to retrieve as major incidents, news stories and events on
Twitter tend to be centred around a hashtag
• The Twitter API is more open and accessible compared to other social media
platforms, which makes Twitter more favourable to developers creating tools to
access data. This consequently increases the availability of tools to researchers.
• Many researchers themselves are using Twitter and because of their favourable
personal experiences, they feel more comfortable with researching a familiar
platform.
10. Different types of Twitter API
• Application Programming Interface
• Twitter’s Search API – focused on relevance and not
completeness, so some tweets and users may be missing
from results (searches go back about 7 days; the 3,200 figure
is the separate cap on tweets retrievable from one user’s timeline)
• Twitter Streaming API – The Streaming APIs give
developers low latency access to Twitter’s global
stream of tweet data (live stream)
• Firehose API – in theory, 100% of Twitter data (most
software allows up to 30 days’ worth of historical
tweets)
11. What if I want data going back
more than 30 days?
• In most instances you will have to pay for it
• I use Texifter (@texifter) with DiscoverText
(@discovertext)
• Cost can range from modest to
very expensive, depending on the query and
time period
12. Legal issues
• Sharing of Twitter datasets is prohibited;
see https://dev.twitter.com/terms/api-terms
• However, sharing Tweet IDs (to look up
the tweets used) is permissible. This is
useful for reproducibility.
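As a minimal sketch of the ID-sharing idea above, the following reduces a collected dataset to a list of Tweet IDs that others could use to look the tweets up again. The `id_str` field follows Twitter's JSON naming, but the sample records and the function name `extract_tweet_ids` are hypothetical.

```python
def extract_tweet_ids(tweets):
    """Return the unique Tweet IDs from collected tweets, in first-seen order."""
    seen = set()
    ids = []
    for tweet in tweets:
        tweet_id = str(tweet["id_str"])
        if tweet_id not in seen:
            seen.add(tweet_id)
            ids.append(tweet_id)
    return ids


# Hypothetical collected records; only the IDs are kept for sharing.
collected = [
    {"id_str": "1001", "text": "first example tweet", "user": "user_a"},
    {"id_str": "1002", "text": "second example tweet", "user": "user_b"},
    {"id_str": "1001", "text": "first example tweet", "user": "user_a"},
]

shareable_ids = extract_tweet_ids(collected)
print("\n".join(shareable_ids))
```

The text and user fields never leave the researcher's machine; only the ID list would be published.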
13. Business Expenditure
• Businesses spend millions of dollars every
year tailoring their brands and protecting
them
• Historically, traditional media and a one-to-many
approach gave control to brands via
advertisers
14. Shift of Power
• With the emergence of social media, the
traditional brand communication process
has reached something of a crisis
• Traditional communication lines are rapidly
breaking down
15. Shift of Power
• It became clear that Twitter was becoming
an important social networking site and public
communication platform
• A number of businesses and social media
marketing professionals then attempted to exploit
the platform for commercial purposes
16. Toyota
• Toyota had to recall a number of its cars in
2009 and 2010 due to a serious safety
fault which resulted in the deaths of over
50 people
• Unlike Sony, they immediately went into
damage control
17. Toyota
• As soon as the recall crisis started getting
media attention, Toyota quickly put
together an ‘Online Newsroom’ and a
‘Social Media Strategy Team’ to
coordinate all the media releases
18. Sony PlayStation Network
• In mid-April 2011 the PlayStation Network was
suddenly shut down without explanation
• Frustrations quickly spread through social media
sites such as Twitter as gamers around the
world voiced their annoyance at not being able
to access their online games
19. Sony PlayStation Network
• The lack of regular updates and
information from Sony served to incense
users
• Users struggled to determine what was
fact and what was rumour on Twitter
20. Sony PlayStation Network
• The lapse in communication was
incomprehensible to consumers
• Lack of regular updates and information
only served to incense users further
21. Sony PlayStation Network
• “I think It is pretty disgusting that Sony have waiting 7
days to tell users that their Credit Card details may have
been compromised”.
• “I bet the hacker will get emails out quicker than Sony!”
22. Toyota
• There was still anger, and negative
viewpoints were shared through social media
services
• However, the company was able to minimise their
impact by eliminating confusion and
keeping the consumer base regularly
informed of developments
23. Brand Management
• The two cases highlight that brands
need to know how they are being
mentioned across social media
• Social Media Analytics is now a huge
market
24. Types of analysis possible
• Sentiment analysis has the potential to
work well with Twitter data, as tweets are
consistent in length (i.e., <= 140 characters)
• However, sarcasm is difficult to detect
within tweets.
• One option is the SentiStrength algorithm
(http://sentistrength.wlv.ac.uk/)
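To make the idea concrete, here is a minimal sketch of lexicon-based sentiment scoring. This is not the SentiStrength algorithm; the tiny word list, the scores, and the function names are made up for illustration. The last example shows the sarcasm problem mentioned above.

```python
import re

# Tiny illustrative lexicon (hypothetical; real lexicons hold thousands of terms).
LEXICON = {"love": 2, "great": 2, "good": 1, "bad": -1, "awful": -2, "disgusting": -2}


def sentiment_score(tweet):
    """Sum the lexicon scores of the words found in a tweet."""
    words = re.findall(r"[a-z']+", tweet.lower())
    return sum(LEXICON.get(word, 0) for word in words)


def sentiment_label(tweet):
    """Map the summed score to a coarse positive/negative/neutral label."""
    score = sentiment_score(tweet)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(sentiment_label("I love this, great service"))
print(sentiment_label("Pretty disgusting response"))
# A sarcastic tweet is scored on its surface words and comes out positive.
print(sentiment_label("Oh great, the network is down again"))
```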
25. Types of analysis possible
• Time series analysis is normally used
when examining tweets over time to see
when a peak of tweets may occur. One I
made today:
26. Last 30 days time series graph
of Croatia
27. Context behind the peak, 12 June 2016
Euro championship: Croatia win their
opening game
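A time series like the one described can be sketched by counting tweets per day and picking out the busiest day. The timestamps below are invented stand-ins, not the data behind the graph, and the function names are hypothetical.

```python
from collections import Counter
from datetime import datetime


def tweets_per_day(timestamps):
    """Count tweets per calendar day from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).date() for ts in timestamps)


def peak_day(counts):
    """Return the (day, count) pair with the highest tweet count."""
    return max(counts.items(), key=lambda item: item[1])


# Invented timestamps for illustration only.
timestamps = [
    "2016-06-10T09:15:00",
    "2016-06-11T18:30:00",
    "2016-06-12T14:05:00",
    "2016-06-12T15:00:00",
    "2016-06-12T15:45:00",
    "2016-06-13T08:20:00",
]

counts = tweets_per_day(timestamps)
day, count = peak_day(counts)
print(day, count)
```

Once a peak is located, reading the tweets from that day supplies the context behind it.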
28. Types of analysis possible
• Network analysis is used to visualize the
connections between people (who is
connected to whom?)
• Who is the most influential Twitter user?
Various measures can be used; a popular
one is betweenness centrality
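As a rough sketch of how betweenness centrality can be computed, the code below implements Brandes' algorithm for a small unweighted, undirected network in plain Python. The user names and connections are hypothetical, and tools such as NodeXL may calculate or normalise the measure differently.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    graph maps each node to a list of its neighbours.
    """
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        stack = []
        predecessors = {node: [] for node in graph}
        num_paths = {node: 0 for node in graph}
        num_paths[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, starting from the nodes farthest away.
        dependency = {node: 0.0 for node in graph}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                dependency[v] += num_paths[v] / num_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each undirected pair is counted from both ends, so halve the totals.
    return {node: score / 2 for node, score in centrality.items()}


# Hypothetical mention network: an edge means two users mentioned each other.
network = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave", "erin"],
    "dave": ["carol"],
    "erin": ["carol"],
}

scores = betweenness_centrality(network)
most_influential = max(scores, key=scores.get)
print(most_influential, scores[most_influential])
```

A user with a high score sits on many of the shortest paths between other users, which is why the measure is often read as a sign of a bridging or brokering role.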
31. Types of analysis possible
• Machine learning, e.g. using a text
classifier such as the naive Bayes
algorithm
• Involves training data, e.g. manually coding
a subset of the data (say 100 tweets in a
dataset of 1,000 tweets), after which the
algorithm will automatically classify the
remaining data
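The train-then-classify workflow above can be sketched with a small multinomial naive Bayes classifier using add-one smoothing. The hand-coded training tweets and labels are invented for illustration, and platforms such as DiscoverText may implement their classifiers differently.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split a tweet into lower-case word, hashtag, and mention tokens."""
    return re.findall(r"[a-z#@']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented, manually coded training subset.
train_texts = [
    "Great open day at the university",
    "Loved the lecture today",
    "Proud to graduate from this university",
    "The library wifi is awful",
    "Terrible queue at the campus cafe",
    "Awful lecture, total waste of time",
]
train_labels = ["positive"] * 3 + ["negative"] * 3

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("awful wifi on campus"))
print(model.predict("great open day"))
```

In practice the coded subset would be far larger, and the classifier's output would be spot-checked against further manual coding before being trusted on the remaining tweets.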
32. Part 2 of the workshop
• Part 2 of the workshop will provide an
overview of some of the cutting-edge
analytics platforms available
• Pause here and create a Twitter account
(if you don’t have one)
34. Visibrain Focus
• Unfortunately it was not possible to get access for
delegates
• However, Visibrain offer a free 30-day trial
• I can provide an overview on this machine
36. Echosec (free version available)
• Location-based social media search: search by
location rather than keywords
• Allows you to examine a specific
geographical area by drawing on
Facebook, Twitter, Instagram, Sina Weibo,
YouTube, Foursquare, Flickr, and VK APIs
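To illustrate what searching by location rather than keywords means, here is a minimal sketch that keeps only geotagged posts within a radius of a chosen point, using the haversine great-circle distance. It does not use Echosec's API; the posts, coordinates, and function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points on Earth."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def posts_within(posts, lat, lon, radius_km):
    """Keep the posts whose coordinates fall inside the search radius."""
    return [
        post for post in posts
        if haversine_km(lat, lon, post["lat"], post["lon"]) <= radius_km
    ]


# Invented geotagged posts (approximate coordinates for Split and Zagreb).
posts = [
    {"text": "Sunny on the Riva", "lat": 43.508, "lon": 16.440},
    {"text": "Morning in Zagreb", "lat": 45.815, "lon": 15.982},
]

nearby = posts_within(posts, 43.5081, 16.4402, radius_km=10)
print([post["text"] for post in nearby])
```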
37. Examples of case studies using Echosec
• Echosec was used following the April 2015 Nepal
Earthquake
• Apps such as Foursquare have the potential to give first
responders the ability to check where things are
• Geographically searching social media data in an area
can show you what you are looking for in an emergency
• Can examine locations of affected areas and see where
people have stopped posting from
38. Real examples of case studies using Echosec
39. Echosec
• Navigate to https://app.echosec.net/
• Near the bottom left there will be an option to
enter a location to search for
• See what intelligence you can gain using
location-based search. (5-10 minutes)
40. Follow the Hashtag
• Free version available to access
• Navigate to
http://www.followthehashtag.com/
42. Twitonomy
• Free version available to access
• Navigate to https://www.twitonomy.com/
43. NodeXL
• Social network analysis looks at the
structure of the networks that form when
people use social media
• One particular tool is called NodeXL.
Unfortunately there is not enough time to download
and install it, but I can demonstrate it on this
machine
44. NodeXL
• To examine network graphs currently
being created and uploaded,
navigate to the NodeXL Graph Gallery:
http://www.nodexlgraphgallery.org/
45. NodeXL – Graph Gallery
46. NodeXL
• Example graphs on the Gallery
• For interpretation see Smith, Rainie,
Shneiderman, & Himelboim (2014)
• Also see this example of 6 types of
network graph
47. University of Sheffield Project
• Produced a report for Stephen Thompson, Head of Digital
at the University of Sheffield,
examining mentions of the
University over the previous 12 months
48. University of Sheffield Project
• Step 1 – Obtain historical data using a
provider such as Sifter and place the data
into DiscoverText
• Step 2 – Using DiscoverText, de-duplicate
the data by removing exact duplicates and
near-duplicate clusters
49. University of Sheffield Project
• Step 3 – From the reduced dataset, take a 10%
sample and manually code it and/or train a
machine classifier to code the entire
dataset.
• I used DiscoverText, a cloud-based,
collaborative text analytics solution
which allows the above.
51. University of Sheffield Project
• By removing duplicates and near
duplicates, the sample of N=43,521 tweets
became a total of N=13,078 tweets.
• This prevents the coding from being dominated
by popular, widely repeated mentions.
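The de-duplication step can be sketched as follows. This is not DiscoverText's method: it is a simple stand-in that normalises each tweet and drops any tweet whose word set is very similar (by Jaccard similarity) to one already kept. The example tweets, the 0.8 threshold, and the function names are assumptions.

```python
import re


def normalise(text):
    """Lower-case a tweet and strip retweet prefixes, URLs, and punctuation."""
    text = text.lower()
    text = re.sub(r"^rt @\w+:\s*", "", text)
    text = re.sub(r"https?://\S+", "", text)
    text = re.sub(r"[^\w\s#@]", "", text)
    return " ".join(text.split())


def jaccard(a, b):
    """Jaccard similarity between the word sets of two normalised tweets."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def deduplicate(tweets, threshold=0.8):
    """Keep the first tweet of each group of exact or near duplicates."""
    kept, kept_normalised = [], []
    for tweet in tweets:
        norm = normalise(tweet)
        if any(jaccard(norm, other) >= threshold for other in kept_normalised):
            continue
        kept.append(tweet)
        kept_normalised.append(norm)
    return kept


# Invented tweets: one original, a retweet, a near duplicate, and an unrelated tweet.
tweets = [
    "Great research news from @sheffielduni today http://example.com/a",
    "RT @someone: Great research news from @sheffielduni today http://example.com/a",
    "Great research news from @sheffielduni today, wow http://example.com/b",
    "Applications for the open day are now live",
]

unique_tweets = deduplicate(tweets)
print(len(unique_tweets))
```

The pairwise comparison is quadratic in the number of tweets, so a dataset of tens of thousands would need a faster approach such as hashing or clustering.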
52. University of Sheffield Project
• A 10% random sample of tweets was
extracted from the filtered dataset (i.e.,
10% of 13,078) to leave a total of n=1,198
tweets (total coding time 19 hours 29
minutes and 20 seconds).
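Drawing a random sample for manual coding can be sketched as below. The stand-in ID list, the seed, and the function name are hypothetical; fixing the seed simply makes the same sample reproducible later.

```python
import random


def random_sample(items, fraction=0.10, seed=2016):
    """Draw a reproducible simple random sample without replacement."""
    rng = random.Random(seed)
    size = round(len(items) * fraction)
    return rng.sample(items, size)


# Stand-in for the IDs of a de-duplicated dataset of 1,000 tweets.
tweet_ids = list(range(1, 1001))

sample = random_sample(tweet_ids)
print(len(sample))
```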
53. University of Sheffield Project
• Conclusions and key findings:
• A university that is very well engaged with its
students, the public, and the mainstream
media
• Ranked highly amongst other Russell Group
universities for followers, and mentions
54. Conclusion
• There is no ‘best’ social media analytics
tool as they all offer something different
and I use them in combination