Module 3 - Big Data course - Visualisation - part 2
Instituut voor Permanente Vorming
Various visualisation techniques
(adapted from Heer, J., Bostock, M., & Ogievetsky, V. (2010, May). A Tour through the Visualization Zoo - A survey of powerful visualisation techniques, from the obvious to the obscure. ACM Queue, 8 (5), https://queue.acm.org/detail.cfm?id=1805128 )
Various interaction techniques
(adapted from Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416 )
Big data: too big to visualize?
Visualisation - techniques, interaction dynamics, big data
1. Post-academic course Big Data
Joris Klerkx
Research Expert, PhD.
joris.klerkx@cs.kuleuven.be
@jkofmsk
Erik Duval
Professor
erik.duval@cs.kuleuven.be
@erikduval
Visualisation - part 2
Big Data - module 3
IVPV - Instituut voor Permanente Vorming
28-05-2015
2. To research, design, create and evaluate useful tools
that augment the human intellect
By ‘augmenting human intellect’ we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems (Douglas Engelbart, 1962).
Augment group - HCI research lab
Dept. of Computer Science
KU Leuven
https://augmenthuman.wordpress.com
Music
Technology Enhanced Learning
e-health
Research 2.0
Health
Media (Consumption)
Science 2.0
6. Humans have advanced perceptual abilities
Humans have little short term memory
Externalize data by using interactive, visual encodings
Our brains make us extremely good at recognizing visual patterns
Our brains remember relatively little of what we perceive
9. Visual mapping
Encode data characteristics into visual form
Each mark (point, line, area,…) represents a data element
Think about relationships between elements (position)
“Simplicity is the ultimate sophistication.”
Leonardo da Vinci
10. J. Mackinlay. Automating the design of graphical presentations of relational information. ACM Transactions On Graphics, 5(2):110–141, 1986.
16. Today
• Tour through the visualisation zoo
• Interactive Dynamics
• Is there too much data to visualize?
• Tools?
17. A tour through the
visualization zoo
Heer, J., Bostock, M., & Ogievetsky, V. (2010, May). A Tour through the Visualization Zoo - A survey of powerful visualisation techniques, from the obvious to the obscure. ACM Queue, 8 (5), https://queue.acm.org/detail.cfm?id=1805128
19. Relative changes in time-series data
An index chart is an interactive line chart that shows percentage changes
for a collection of time-series data based on a selected index point.
http://homes.cs.washington.edu/~jheer//files/zoo/ex/time/index-chart.html
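The re-indexing behind an index chart can be sketched in a few lines of Python. This is a minimal sketch, assuming each series is a plain list of values and the index point is a list position; the name index_series and the sample prices are hypothetical and not taken from the linked example.

```python
def index_series(values, index_pos):
    """Return the percentage change of every value relative to values[index_pos]."""
    base = values[index_pos]
    if base == 0:
        raise ValueError("the value at the index point must be non-zero")
    return [100.0 * (v - base) / base for v in values]


# Hypothetical prices for two series on very different scales.
prices = {"A": [50.0, 55.0, 60.0], "B": [200.0, 180.0, 220.0]}

# Selecting position 0 as the index point makes both series start at 0 %.
indexed = {name: index_series(series, 0) for name, series in prices.items()}
print(indexed)
```

Moving the index point (for example when the user hovers over another date) only requires calling index_series again with a different position, which is what makes the chart interactive.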
20. Aggregated time-series data
A stream graph visually sums time-series values by stacking them
http://hci.stanford.edu/jheer/files/zoo/ex/time/stack.html
26. The horizon graph is a technique for increasing the data density
of a time-series view while preserving resolution.
Sizing the Horizon: The Effects of Chart Size and Layering on the Graphical Perception of Time Series Visualizations
Jeffrey Heer, Nicholas Kong, Maneesh Agrawala
ACM Human Factors in Computing Systems (CHI), 2009. pp. 1303 - 1312. Best Paper Award
30. Statistical Distributions
Reveal how a set of numbers is distributed and thus help an
analyst better understand the statistical properties of the data
32. A stem-and-leaf plot bins numbers according to the first significant digit, and
then stacks the values within each bin by the second significant digit.
http://homes.cs.washington.edu/~jheer//files/zoo/ex/stats/stem-and-leaf.html
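The binning described above can be sketched as follows. This is a minimal sketch that assumes non-negative integers below 100, so that the tens digit is the stem and the units digit is the leaf; the function names and sample values are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Bin integers in 0..99 by tens digit (stem); leaves are the sorted units digits."""
    bins = defaultdict(list)
    for v in sorted(values):
        bins[v // 10].append(v % 10)
    return dict(bins)


def render(bins):
    """Format the bins as text, one stem per row."""
    rows = []
    for stem, leaves in sorted(bins.items()):
        rows.append(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
    return "\n".join(rows)


print(render(stem_and_leaf([27, 12, 34, 21, 15, 29])))
```

Because every leaf is printed, the plot keeps the individual values while the row lengths still show the shape of the distribution.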
34. A box-and-whisker plot can convey statistical
features such as the mean, median, quartile boundaries,
or extreme outliers.
http://admin-apps.webofknowledge.com/JCR/help/h_boxplot.html
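The statistics a box-and-whisker plot draws can be computed with Python's standard library. This is a sketch under stated assumptions: quartiles use the "inclusive" method of statistics.quantiles, and outliers follow the common 1.5 x IQR convention, which the slide does not specify. The function name and sample data are hypothetical.

```python
import statistics


def box_plot_summary(data):
    """Compute the values a box-and-whisker plot typically encodes."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: points beyond 1.5 * IQR from the box count as outliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "mean": statistics.mean(data),
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": sorted(v for v in data if v < low_fence or v > high_fence),
    }


print(box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 40]))
```

In the sample data the single extreme value pulls the mean well above the median, which is exactly the kind of difference the plot makes visible.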
35. Statistical distribution of data
The Q-Q plot compares two probability distributions by graphing
their quantiles against each other.
http://hci.stanford.edu/jheer/files/zoo/ex/stats/qqplot.html
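The points of a Q-Q plot can be derived by computing the same quantiles for both samples and pairing them. This is a minimal sketch using statistics.quantiles with the "inclusive" method; the function name qq_points, the number of quantiles, and the sample data are assumptions for illustration.

```python
import statistics


def qq_points(sample_a, sample_b, n_quantiles=4):
    """Pair matching quantiles of two samples: one (x, y) point per cut point."""
    qa = statistics.quantiles(sample_a, n=n_quantiles, method="inclusive")
    qb = statistics.quantiles(sample_b, n=n_quantiles, method="inclusive")
    return list(zip(qa, qb))


a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [2 * v for v in a]  # same shape as a, but twice the spread

# Points on a straight line indicate distributions of the same shape.
print(qq_points(a, b))
```

If the two samples come from the same distribution the points fall near the line y = x; a straight line with another slope, as here, suggests the same shape at a different scale.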
36. Representing relationships/correlations among multiple variables.
A scatter plot matrix (SPLOM) uses small multiples of scatter plots to show a set of pairwise relations among variables, graphing every pair of variables in two dimensions
http://homes.cs.washington.edu/~jheer//files/zoo/ex/stats/splom.html
38. Maps
Mostly based upon a cartographic projection: a mathematical
function that maps the three-dimensional geometry of the
Earth to a two-dimensional image
Other maps knowingly distort or abstract geographic features
to tell a richer story or highlight specific data.
http://geoawesomeness.com/topics/web-maps/
http://unfoldingmaps.org
http://ffffound.com/home/tillnm/found/
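As an illustration of a cartographic projection as a mathematical function, here is a minimal sketch of the spherical Mercator projection. Mercator is chosen only as a familiar example and is not named on the slide; the function name and the unit radius are assumptions.

```python
import math


def mercator(lon_deg, lat_deg, radius=1.0):
    """Project longitude/latitude in degrees onto the plane (spherical Mercator)."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


# Equal steps in latitude map to growing steps in y towards the poles,
# which is why high-latitude regions look larger than they are.
for lat in (0, 20, 40, 60, 80):
    print(lat, round(mercator(0, lat)[1], 3))
```

The growing vertical spacing shows the kind of distortion every flat map has to accept somewhere.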
39. • Google Maps - Well rounded, established mapping solution, especially for non-developers
to get a basic map on the web, along with all the powers that Google is (in)famous for.
• OpenLayers - For situations when other mapping frameworks can’t solve your spatial
analysis problems.
• Leaflet - Currently, easily the best mapping framework for general mapping purposes,
especially if you don’t need the additional services that MapBox or CartoDB provide.
• MapBox - Fast growing and market changing mapping solution for when you want more
control over map styling or have a need for services that others are not providing, such as
detailed satellite images, geocoding or directions.
• Unfolding - to create interactive maps and geovisualizations in Processing and Java
http://www.toptal.com/web/the-roadmap-to-roadmaps-a-survey-of-the-best-online-mapping-tools
Typical Mapping Tools
52. Graduated symbol maps place symbols/glyphs over an
underlying map
http://homes.cs.washington.edu/~jheer//files/zoo/ex/maps/symbol.html
53. A cartogram distorts the shape of geographic regions so that the
area directly encodes a data variable
http://homes.cs.washington.edu/~jheer//files/zoo/ex/maps/cartogram.html
57. There is no perfect map
How could you actually compare sizes of different continents and countries?
https://www.youtube.com/watch?v=KUF_Ckv8HbE
58. Hierarchies
Most data can be organised into natural hierarchies
Special visualization techniques exist to leverage hierarchical
structure, allowing rapid multiscale inferences: micro-observations of
individual elements and macro-observations of large groups
59. A node-link diagram laid out with the Reingold-Tilford algorithm
http://hci.stanford.edu/jheer/files/zoo/ex/hierarchies/tree.html
60. The dendrogram (or cluster) algorithm places the leaf nodes of the tree at the
same level
The layout can use polar coordinates instead of Cartesian coordinates
http://homes.cs.washington.edu/~jheer//files/zoo/ex/hierarchies/cluster-radial.html
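Switching a tree layout from Cartesian to polar coordinates amounts to reinterpreting the two layout axes. This is a minimal sketch, assuming the Cartesian layout gives each node a breadth position as a fraction of the full width and a depth as the distance from the root; the function name to_radial is hypothetical.

```python
import math


def to_radial(breadth, depth):
    """Map a Cartesian tree position to the plane of a radial layout.

    breadth: position across the tree as a fraction in [0, 1), used as the angle.
    depth: distance from the root, used as the radius.
    """
    angle = 2 * math.pi * breadth
    return depth * math.cos(angle), depth * math.sin(angle)


# Four leaves at the same depth end up evenly spaced on one circle.
for breadth in (0.0, 0.25, 0.5, 0.75):
    x, y = to_radial(breadth, 2.0)
    print(round(x, 3), round(y, 3))
```

Leaves that shared one row in the Cartesian dendrogram now share one circle, with the root at the centre.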
62. The adjacency diagram is a space-filling variant of the node-link
diagram; rather than drawing a link between parent and child in the
hierarchy, nodes are drawn as solid areas (either arcs or bars), and their
placement relative to adjacent nodes reveals their position in the
hierarchy
http://homes.cs.washington.edu/~jheer//files/zoo/ex/hierarchies/icicle.html
63. The sunburst layout, shown in figure 4E, is equivalent to the
icicle layout, but in polar coordinates.
http://homes.cs.washington.edu/~jheer//files/zoo/ex/hierarchies/sunburst.html
66. Enclosure diagrams use containment rather than adjacency to
represent the hierarchy
Squarified Treemaps - space filling
http://homes.cs.washington.edu/~jheer//files/zoo/ex/hierarchies/treemap.html
72. http://www.youtube.com/watch?v=wQpTM7ASc-w
T. Nagel, M. Maitan, E. Duval, A. Vande Moere, J. Klerkx, K. Kloeckl, and C. Ratti. Touching transport - a case study on visualizing metropolitan public
transit on interactive tabletops. In AVI2014: 12th ACM International Working Conference on Advanced Visual Interfaces, pages 281–288, 2014.
74. Chord diagrams show directed relationships among a
group of entities. Relationships can be quantitative or
binary
http://bl.ocks.org/mbostock/4062006
Ye L, Amberg J, Chapman D et al. 2013 Fish gut
microbiota analysis differentiates physiology and
behavior of invasive Asian carp and indigenous American
fish The ISME journal
76. Choices of representation (e.g., matrix diagram) and
interactive parameterization (e.g., default sort order) can be
critical to unearthing data quality issues that can otherwise
undermine accurate analysis.
86. Custom programming of a specific visualization component
Specific tools that use a chart typology (e.g., Excel)
Data-flow graphs deconstruct the visualization process into a fine-grained set of
operators (data import, transformation, layout, coloring, etc.)
Visualization construction using formal grammars (e.g., ggplot2 in R, Protovis, etc.)
Data & View Specification controls serve to visually &
interactively encode data
95. • Categorical/ordinal data
• radio buttons, checkboxes, scrollable lists,
hierarchies, search boxes (with autocomplete)
• Ordinal, quantitative, and temporal data
• a standard slider (for a single threshold value) or a
range slider (for specifying multiple endpoints).
Filtering allows rapid and reversible
exploration of data subsets
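The filter controls listed above can be thought of as predicates over the records. This is a minimal sketch, assuming records are dictionaries with a "category" and a "value" field; the field names, the function name, and the sample data are hypothetical.

```python
def filter_records(records, allowed_categories, value_range):
    """Keep records whose category is ticked and whose value lies within the slider range."""
    low, high = value_range
    return [
        r for r in records
        if r["category"] in allowed_categories and low <= r["value"] <= high
    ]


records = [
    {"category": "bus", "value": 12},
    {"category": "tram", "value": 30},
    {"category": "bus", "value": 45},
    {"category": "metro", "value": 20},
]

# Checkboxes select categories; a range slider selects the two endpoints.
subset = filter_records(records, {"bus", "metro"}, (10, 25))
print(subset)
```

The original list is never modified, so widening the slider or ticking another checkbox simply recomputes the subset. That is what keeps the exploration reversible.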
96. Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
100. Query controls can be further augmented with
visualizations of their own
101. Sorting helps surface
trends, clusters, …
• Choices in a toolbar
• Clicks on the header in a table
• Can be complicated in the case of multiple-view
displays
104. Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
105. Select items to highlight, filter or
manipulate them
• Mouse clicks, free-form lassos, area cursors
(‘brushes’), mouse hovering, etc.
• depends on the device
• Varying expressive power
• selections of a collection of items
• selections as queries over the data (e.g., drawing a
rectangle -> range query)
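Treating a drawn rectangle as a range query can be sketched by inverting the chart's scales. This is a minimal sketch, assuming linear scales described by a data domain and a pixel range; all names and numbers are hypothetical.

```python
def invert(pixel, domain, pixel_range):
    """Map a pixel coordinate back to a data value on a linear scale."""
    d0, d1 = domain
    p0, p1 = pixel_range
    return d0 + (pixel - p0) / (p1 - p0) * (d1 - d0)


def brush_to_range_query(brush, x_scale, y_scale):
    """Turn a brushed pixel rectangle into data-space intervals."""
    (px0, py0), (px1, py1) = brush
    xs = sorted(invert(p, *x_scale) for p in (px0, px1))
    # Sorting also handles a y axis whose pixels grow downwards.
    ys = sorted(invert(p, *y_scale) for p in (py0, py1))
    return {"x": (xs[0], xs[1]), "y": (ys[0], ys[1])}


def run_query(points, query):
    """Return the points that fall inside the queried intervals."""
    (x_lo, x_hi), (y_lo, y_hi) = query["x"], query["y"]
    return [(x, y) for x, y in points if x_lo <= x <= x_hi and y_lo <= y <= y_hi]


x_scale = ((0, 100), (0, 500))  # data 0..100 drawn on pixels 0..500
y_scale = ((0, 10), (300, 0))   # data 0..10 drawn on pixels 300..0
query = brush_to_range_query(((100, 60), (250, 150)), x_scale, y_scale)
print(query)
print(run_query([(10, 6), (30, 6), (40, 9), (45, 7.5)], query))
```

Storing the selection as intervals rather than as a list of items means it can be re-applied when the data changes or passed on to other views.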
108. Select by slope and tolerance
Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
109. Mapping mouse gestures to query patterns
Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
110. Navigate to examine high-level
patterns & low-level detail
• Overview first, zoom & filter, then details-on-demand
• Start with what you know, then grow
• Search, show context, expand on demand.
• Focus + Context
• Semantic Zooming
• Magical lenses
111. Figure 4: Setting of the evaluation.
B. Vandeputte, E. Duval, and J. Klerkx. Interactive sensemaking in authorship networks. Proceedings of the ACM International
Conference on Interactive Tabletops and Surfaces, ITS11, pp. 246–247, 2011.
Overview first, zoom and filter, details on demand
112. B. Vandeputte, E. Duval, and J. Klerkx. Applying design principles in authorship networks-a case study. In CHI EA’12:
Proceedings of the 2012 ACM annual conference extended abstracts on Human Factors in Computing Systems, pages 741–
744, 2012. (https://www.youtube.com/watch?v=R5CeTEejdBA)
Start with what you know, then grow
Search, show context, expand on demand
115. Focus + Context
Semantic Zooming
Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
116. Magical Lenses
C. Tominski, S. Gladisch, U. Kister, R. Dachselt, and H. Schumann. A Survey on Interactive Lenses in Visualization. EuroVis State-of-the-Art Reports, Swansea, UK, Eurographics
Association, 2014.
118. C. Tominski, S. Gladisch, U. Kister, R. Dachselt, and H. Schumann. A Survey on Interactive Lenses in
Visualization. EuroVis State-of-the-Art Reports, Swansea, UK, Eurographics Association, 2014.
119. Coordinate views for linked, multidimensional exploration
Enables seeing data from different perspectives
Multiple views can facilitate comparison
125. Organize multiple windows & workspaces
• Tiled approaches (different widgets) let analysts see
all information and selectors at once, minimizing
distracting scrolling or window operations, so that
they can concentrate on extracting and
reporting insights.
• Layout organization tools will become decisive
factors in creating an effective user experience
126. Orchestrate attention and mentally integrate patterns among views
Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
131. Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
132. Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
133. Annotate patterns to document
findings
Record, organize, and communicate insights gained
during visual exploration
134. Freeform graphical annotations without explicit tie to the
underlying data
Data-aware annotations
Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
135.
136. Share views and annotations to
enable collaboration
Real-world analysis is very much a social process
that may involve multiple interpretations,
discussion, and dissemination of results.
141. Guide users through analysis tasks or
stories
• Incorporate guided analytics to lead analysts
through workflows for common tasks.
• Narrative visualization
142. Heer, J., & Shneiderman, B. (2012, February). Interactive Dynamics for Visual Analysis. ACM Queue, 10 (2), p. 30. http://queue.acm.org/detail.cfm?id=2146416
147. How much “big” data
can we visualise?
Petabytes? Exabytes? Yottabytes?!
Direct visualisation of big data ‘in the raw’ is
probably not so effective
148. Is there too much data to
visualize?
You only have so many pixels on a screen
Each pixel, one data point
E.g., a typical HDTV screen contains 1920 × 1080 = 2,073,600 pixels
prysm.com
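The pixel budget can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and the figure of one billion records is only an illustrative dataset size.

```python
def pixel_count(width, height):
    """Number of pixels available on a display."""
    return width * height


def records_per_pixel(n_records, width, height):
    """Average number of records that must share one pixel."""
    return n_records / pixel_count(width, height)


print(pixel_count(1920, 1080))  # 2073600
# With a billion records, several hundred records compete for every pixel.
print(round(records_per_pixel(1_000_000_000, 1920, 1080)))
```

Once the ratio exceeds one record per pixel, some form of aggregation, sampling or filtering is unavoidable.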
154. Cluttered displays
Binned density scatterplot
Hexagonal instead of rectangular
Heer, J. & Kandel, S. (2012), Interactive Analysis of Big Data, XRDS, 19 (1)
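Binning replaces individual points by counts per cell, so the display cost depends on the number of bins rather than the number of records. The sketch below uses rectangular bins for brevity, not the hexagonal bins mentioned on the slide; the function name, the bin counts, and the sample points are hypothetical.

```python
from collections import Counter


def bin_counts(points, x_range, y_range, nx, ny):
    """Count points per cell of an nx-by-ny rectangular grid over the given ranges."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    counts = Counter()
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore points outside the plotted area
        # Points exactly on the upper edge are assigned to the last bin.
        i = min(int((x - x_min) / (x_max - x_min) * nx), nx - 1)
        j = min(int((y - y_min) / (y_max - y_min) * ny), ny - 1)
        counts[(i, j)] += 1
    return counts


points = [(0.1, 0.1), (0.2, 0.2), (0.9, 0.9), (1.0, 1.0), (2.0, 2.0)]
density = bin_counts(points, (0.0, 1.0), (0.0, 1.0), nx=2, ny=2)
print(dict(density))
```

Each cell is then drawn once, with the count mapped to colour, so a million points cost no more to render than a thousand.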
155. Perceptual scalability of a display
should be limited by the chosen
resolution of the data, not the number
of records (Heer & Kandel, 2012)
Heer, J. & Kandel, S. (2012), Interactive Analysis of Big Data, XRDS, 19 (1)
157. Visualizations might help reveal multidimensional patterns
Use the power of the machine to find a proxy in the data that
predicts the selected variables
Depending on their specific questions, domain experts might
select a subset of variables they are interested in
158. http://www.perceptualedge.com/blog/?p=2046
In this day of so-called Big Data,
organizations are scrambling to
implement new software and
hardware to increase the amount of
data that they collect and store.
In so doing they are unwittingly
making it harder to find the needles of
useful information in the rapidly
growing mounds of hay.
If you don’t know how to
differentiate signals from noise,
adding more noise only makes
matters worse.
Monday, June 1st, 2015
159. When we rely on data for decision making, how do we tell
what qualifies as a signal and what is merely noise?
In and of itself, data is neither. It is merely a collection of
facts.
When a fact is true, useful, and deserves a response, only
then is it a signal. When it isn’t, it’s noise. It’s that simple
(Few, http://www.perceptualedge.com/blog/?p=2046, 2015)
161. Visual Information-Seeking Mantra (Ben Shneiderman)
Overview First, Zoom and Filter, Details-on-Demand
Visual Analytics Mantra (Daniel Keim)
Analyze First, Show the Important, Zoom and Analyse
Further, Details-on-Demand
162. Interactive analysis tools can help
quell “big data” by augmenting our
ability to manipulate and reason
about it (Heer & Kandel, 2012)
Heer, J. & Kandel, S. (2012), Interactive Analysis of Big Data, XRDS, 19 (1)
163. In the face of a data deluge, what remains
relatively constant is our own cognitive ability
to make sense of the data and reach reliable,
informed decisions. Big data is of little help
when decoupled from sound judgment.
J. Heer
Heer, J. & Kandel, S. (2012), Interactive Analysis of Big Data, XRDS, 19 (1)
164. Large data is a wild beast and you’d better treat it
with the right tools. Visualization is a great tool to
convey what automatic data analysis algorithms
discover. And often it is a very challenging task!
What the algorithms spit is exciting new complex
data that requires creativity and knowledge as well.
E. Bertini
http://fellinlovewithdata.com/guides/how-do-you-visualize-too-much-data
165. ▪ Interactive visualization of a million items
J.D. Fekete and C. Plaisant.
▪ Random Sampling as a Clutter Reduction Technique to Facilitate Interactive Visualisation of Large
Datasets
G. Ellis (part of it in collab. with yours truly).
▪ A Sampling Approach to Deal with Cluttered Information Visualizations
E. Bertini (my phd thesis).
▪ TreeJuxtaposer: Scalable Tree Comparison using Focus+Context with Guaranteed Visibility
T. Munzner, F. Guimbretiere, S. Tasiran, L. Zhang, and Y. Zhou.
▪ Beyond visual acuity: the perceptual scalability of information visualizations for large displays
B. Yost, Y. Haciahmetoglu, and C. North.
▪ Extreme visualization: squeezing a billion records into a million pixels
B. Shneiderman.
▪ Measuring Data Abstraction Quality in Multiresolution Visualization
Q. Cui, M. O. Ward, E. A. Rundensteiner, and J. Yang.
▪ imMens: Real-time Visual Querying of Big Data
Zhicheng Liu, Biye Jiang, Jeffrey Heer
Some papers about big data
visualisation
http://fellinlovewithdata.com/guides/how-do-you-visualize-too-much-data
173. BOOKS
• “Readings in Information Visualization: Using Vision to Think”, Card, S. et al.
• “Signal”, “Now You See It”, “Show Me the Numbers”, Few, S.
• “Beautiful Evidence”, Tufte, E.
• “Information Visualization: Perception for Design”, Ware, C.
• “Beautiful Visualization: Looking at Data through the Eyes of Experts (Theory in Practice)”, Steele, J., & Iliinsky, N.