8. Computer Engineer
Bangkok, Thailand
M.S. in Computer Science
Univ. of Maryland
Krist Wongsuphasawat / @kristw
11. Computer Engineer
Bangkok, Thailand
PhD in Computer Science
Univ. of Maryland
Information Visualization
Krist Wongsuphasawat / @kristw
IBM
Microsoft
Sr. Data Visualization Scientist
Twitter
19. data
at Twitter
“Tweets”
#events
TV Shows, New Year, Earthquake, Oscars, Protest, Super Bowl, World Cup, Election, Breaking news, …
#curiosity
Sleep pattern, Human behavior, Language, …
What could we learn from the Tweets?
21. vis
data
at Twitter
“Tweets”
Goal:
Tell stories about an event,
pursue curiosity or inspiration
(with deadline)
27. Challenges
• Too much data
• Want only relevant Tweets
• hashtag: #BRA
• keywords: “goal”
• Need to aggregate & reduce size
• Long processing time (hours)
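The filter-then-aggregate idea in these bullets can be sketched roughly as follows. This is a minimal illustration, not the pipeline used at Twitter: the sample records, the field names `created_at` and `text`, and the helpers `is_relevant` and `tweets_per_minute` are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample records; real Tweets carry many more fields.
tweets = [
    {"created_at": "2014-06-12T20:11:05", "text": "GOAL!!! #BRA"},
    {"created_at": "2014-06-12T20:11:40", "text": "What a goal by Neymar"},
    {"created_at": "2014-06-12T20:12:10", "text": "Lunch time"},
    {"created_at": "2014-06-12T20:12:30", "text": "Come on #bra"},
]

def is_relevant(text, hashtags=("#bra",), keywords=("goal",)):
    """Keep a Tweet if it contains any tracked hashtag or keyword."""
    # Lowercase, split on whitespace, and trim common punctuation.
    words = [w.strip(".,!?\"'") for w in text.lower().split()]
    return any(h in words for h in hashtags) or any(k in words for k in keywords)

def tweets_per_minute(records):
    """Reduce size: count relevant Tweets per minute instead of keeping rows."""
    counts = Counter()
    for record in records:
        if is_relevant(record["text"]):
            minute = datetime.fromisoformat(record["created_at"]).replace(second=0)
            counts[minute.isoformat()] += 1
    return dict(sorted(counts.items()))

per_minute = tweets_per_minute(tweets)
print(per_minute)
```

Aggregating to one count per minute is what turns millions of rows into something a laptop can handle; the same filter and group-by would normally run on the cluster first.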
31. Workflow
Data Storage: Hadoop Cluster => Vertica => Your laptop
Tool: Pig / Scalding (slow) => SQL => node.js / python / excel (fast)
Smaller dataset => Final dataset
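The last hop of this workflow, from smaller dataset to final dataset on a laptop, might look like the sketch below. It assumes the SQL step exported a CSV; the column names (`minute`, `topic`, `tweet_count`), the sample rows, and the helper `to_final_dataset` are invented for illustration.

```python
import csv
import io
import json

# Stand-in for a CSV exported from the SQL step; columns and rows are hypothetical.
SMALLER_DATASET = """minute,topic,tweet_count
2014-01-28T21:15,education,120
2014-01-28T21:15,energy,45
2014-01-28T21:16,education,80
"""

def to_final_dataset(csv_text):
    """Group rows by topic into one series per topic."""
    series = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        series.setdefault(row["topic"], []).append(
            {"minute": row["minute"], "count": int(row["tweet_count"])}
        )
    return [{"topic": topic, "values": values} for topic, values in sorted(series.items())]

final = to_final_dataset(SMALLER_DATASET)
final_json = json.dumps(final, indent=2)  # ready to load in a browser-based chart
print(final_json)
```

Because this step runs in seconds, it is the place to iterate on the shape of the data; the slow cluster jobs should not need to be rerun for every change.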
32. vis
data
at Twitter
“Tweets”
1 Get data
2 Visualize
34. Visualize
• Peek into data
• Check data & test ideas
• Decide how to visualize
• Guided by data type
• Choose tools: R, d3, Tableau, Yeoman
• Start building
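"Peek into data" and "guided by data type" can be approximated with a few lines before any tool is chosen. The sketch below is one possible way to do it: the rows, the column names, and the rule that columns named `lat` or `lon` count as geo are all assumptions made for this example.

```python
from datetime import datetime

GEO_COLUMNS = {"lat", "lon", "latitude", "longitude"}  # assumed naming convention

def guess_kind(name, values):
    """Very rough guess at a column's type from its name and string values."""
    def all_parse(parse):
        try:
            for value in values:
                parse(value)
            return True
        except ValueError:
            return False

    if name in GEO_COLUMNS:
        return "geo"
    if all_parse(float):
        return "number"
    if all_parse(datetime.fromisoformat):
        return "time"
    return "text"

def peek(rows, n=2):
    """Summarize each column: guessed kind, distinct count, first few values."""
    summary = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        summary[name] = {
            "kind": guess_kind(name, values),
            "distinct": len(set(values)),
            "sample": values[:n],
        }
    return summary

# Hypothetical rows, all values kept as strings as they would come from a CSV.
rows = [
    {"created_at": "2014-01-28T21:15:00", "text": "Watching #SOTU", "lat": "38.9", "lon": "-77.0"},
    {"created_at": "2014-01-28T21:16:00", "text": "jobs jobs jobs", "lat": "40.7", "lon": "-74.0"},
]
summary = peek(rows)
for name, info in summary.items():
    print(name, info)
```

A column guessed as time suggests a timeline, text suggests topics or word counts, and geo suggests a map.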
79. Time + Text + Geo State of the Union
1) Timeline + topics from Tweets
2) Context (speech)
3) Volume of Tweets by topic during the selected part of the SOTU
4) Density map of Tweets about the selected topic
twitter.github.io/interactive/sotu2014
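Views 3 and 4 above boil down to two small aggregations: counts by topic within a selected time range, and counts per map cell for a selected topic. The sketch below shows one way to compute them; it is not the code behind the published visualization, and the record layout, sample values, and function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical records: (seconds into the speech, topic, latitude, longitude)
tweets = [
    (30, "education", 38.9, -77.0),
    (45, "energy", 40.7, -74.0),
    (50, "education", 38.6, -77.4),
    (200, "education", 34.1, -118.2),
]

def volume_by_topic(records, start, end):
    """View 3: Tweet counts per topic within the selected part of the speech."""
    return Counter(topic for t, topic, _, _ in records if start <= t < end)

def density_grid(records, topic, cell_deg=1.0):
    """View 4: Tweet counts per lat/lon grid cell for the selected topic."""
    grid = Counter()
    for _, record_topic, lat, lon in records:
        if record_topic == topic:
            cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
            grid[cell] += 1
    return grid

selected = volume_by_topic(tweets, start=0, end=60)
cells = density_grid(tweets, "education")
print(selected)
print(cells)
```

Each interaction in the piece (brushing the timeline, picking a topic) only changes the arguments to aggregations like these, which is why the data has to be small by the time it reaches the browser.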
86. Time + Text + Geo (c) New Year 2014
twitter.github.io/interactive/newyear2014/
89. vis
data
at Twitter
“Tweets”
1 Get data
2 Visualize
3 Evaluate
Iterate!
90. Evaluation
• Self
• Peer feedback
• Non team members / Potential audience
100. vis
data
at Twitter
“Tweets”
• users
• followers graph
• logs
• etc.
• derived data: language, sentiment
1 Get data: big data => small data
2 Visualize: What? Where? When? Who? …
3 Evaluate: self, peer, external
(with deadline)
@kristw / https://interactive.twitter.com
+ visualizations by @philogb, @miguelrios & @trebor