4. Problem - Solution
• “Big data” are datasets that grow so large that
they become awkward to work with using
on-hand database management tools.
- Wikipedia.org
• Graphviz is open source software developed
by AT&T Labs that aids analysis of “big data”
through visualization.
8. A Little Graph Theory Lesson
• Nodes (aka vertices) – the fundamental
units out of which graphs are formed
• Edges (called arcs when directed) – lines, each
connecting two nodes
• Graph – a collection of nodes and edges
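To make these definitions concrete, here is a minimal Python sketch that stores a graph as a list of nodes plus a list of edges and writes it out in Graphviz's DOT language. The helper name `to_dot` and the node names A, B, C are illustrative assumptions, not part of Graphviz or of the original slides.

```python
def to_dot(nodes, edges, directed=True, name="G"):
    """Render a graph as text in Graphviz's DOT language.

    to_dot is a hypothetical helper written for this example.
    """
    keyword = "digraph" if directed else "graph"
    connector = "->" if directed else "--"
    lines = [f"{keyword} {name} {{"]
    # Declare every node, so isolated nodes still appear in the drawing.
    for node in nodes:
        lines.append(f'    "{node}";')
    # Each edge is a (source, target) pair connecting two nodes.
    for source, target in edges:
        lines.append(f'    "{source}" {connector} "{target}";')
    lines.append("}")
    return "\n".join(lines)


# Illustrative data: three nodes joined by two directed edges.
nodes = ["A", "B", "C"]
edges = [("A", "B"), ("B", "C")]

dot_text = to_dot(nodes, edges)
print(dot_text)
```

Saving the printed text as graph.dot and running `dot -Tpng graph.dot -o graph.png` should produce a picture of the graph, assuming Graphviz is installed.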
10. Let’s Play 6 Degrees of Kevin Bacon!
http://public.research.att.com/~volinsky/cgi-bin/prox/help.pl
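The game boils down to a shortest-path question on a graph: actors are nodes, and an edge joins two actors who appeared together. A minimal sketch using breadth-first search follows. The co-star data and the function name `degrees_of_separation` are made up for illustration, and this is not necessarily how the linked tool computes its answer.

```python
from collections import deque


def degrees_of_separation(graph, start, goal):
    """Return the fewest edges between start and goal, or None if unreachable."""
    if start == goal:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return distance + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


# Hypothetical co-star data: each key lists the actors it shares a film with.
costars = {
    "Kevin Bacon": ["Actor A", "Actor B"],
    "Actor A": ["Kevin Bacon", "Actor C"],
    "Actor B": ["Kevin Bacon"],
    "Actor C": ["Actor A", "Actor D"],
    "Actor D": ["Actor C"],
    "Actor E": [],
}

print(degrees_of_separation(costars, "Actor D", "Kevin Bacon"))
```

Breadth-first search visits nodes in order of increasing distance, so the first time it reaches the goal it has found the smallest number of hops.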
17. Beyond Graphviz
• gephi.org (also open source)
– Data visualization over time
– WYSIWYG controls for layout style, node/edge
size, borders, colors, labels, etc.
– Group nodes by data points
– Community detection
– Reduction to filter out the noise
• http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html
The most dramatic part runs from 03:15 to 05:06