Information graphics have been used for thousands of years to illustrate ideas and communicate information. However, handcrafting high-quality information graphics customized to a specific situation (e.g., data characteristics and user tasks) takes skill and time. The problem becomes more acute with big data. To address it, we are researching and developing mixed-initiative visual analytic systems that leverage the intelligence of both humans and machines to help users derive insights from massive data. On the one hand, such a system automatically guides users through their data analytic tasks by recommending suitable visualizations and discovery paths in context. On the other hand, users interactively explore, verify, and improve visual analytic results, which in turn helps the system learn from users' behavior and improve its quality over time. In this talk, I will present key technologies that we have developed in building mixed-initiative visual analytic systems, including feature-based visualization recommendation and optimization-based approaches to dynamic data transformation for more effective visualization. I will also use concrete applications to demonstrate the use and value of mixed-initiative visual analytic systems, and discuss existing challenges and future directions in this area.
2. Outline
Definitions
– Big data
– Mixed-initiative visual analytics
Challenges and Goals
Our Approaches
– Key technologies
– Use cases
Future Directions
3. Definitions
“Big Picture”
Mixed-Initiative Visual Analytics of Big Data
Volume
– 210 million customers
– 10 billion transactions
– 850 TB of data
– …
Velocity (per minute, www.domo.com)
– 100,000 tweets
– 684,478 FB shares
– 204 million emails
– …
Veracity
– Rumors
– Incomplete data
– …
Variety
4. Definitions
“Big Picture”
Mixed-Initiative Visual Analytics of Big Data
User: Tell me more about my customers
System: Here is your customer summary. I also suggest …
user-initiative | system-initiative
dougblakely.com
5. Key Challenges
“Tell me about … ”
– How to visually summarize large
volumes of heterogeneous data to
quickly discover meaningful
insights
“What do they mean?”
– How to visually explain
discovered insights (complex +
abstract) and guide exploration
“This does not look right, I
want to … do it again”
– How to allow users to correct
analytic results and adopt
previous analytic steps
www.gfi.com
big data
6. Our Goals
Combine advanced data analytics and interactive visualization to help end users
– Derive and consume insights
– Explore various analytic paths and trust derived insights
– Discover opportunities to compensate for and improve insights and analytic processes
9. “Big Picture”: Our Focus
Visual Analytics Concierge
Analytic Requests
Output: Interactive Visualization
Data Analytic Recipes | User Models | Analytic Engines
Alternative Visualization | Visualization Examples
Visualization Recommender | Data Transformer | Insight Revision & Provenance
11. Data Transformation: Motivation
“Dirty”, noisy data
Large data variance
“Plain” raw data
Distorted, illegible
visualization
“Messy” visualization
without insights
Quality of data to be visualized affects the
quality of visualization
12. Example 1: “Dirty” Noisy Data
Original visualization | After separating noise
Task: Show houses on a map
13. Example 2: Large Data Variance
Task: Summarize houses by styles and towns
Original visualization | After normalization
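The slide does not say which normalization was applied. As a minimal sketch of one plausible reading, the snippet below converts raw house counts per (town, style) into within-town shares, so that a town with many listings does not dwarf the others. The function name `normalize_by_group` and the sample counts are hypothetical.

```python
from collections import defaultdict

def normalize_by_group(counts):
    """Convert raw counts per (town, style) into within-town shares."""
    totals = defaultdict(int)
    for (town, _style), n in counts.items():
        totals[town] += n
    return {
        (town, style): (n / totals[town] if totals[town] else 0.0)
        for (town, style), n in counts.items()
    }

# Hypothetical data: Town A has far more listings than Town B.
counts = {
    ("Town A", "Colonial"): 900,
    ("Town A", "Ranch"): 100,
    ("Town B", "Colonial"): 3,
    ("Town B", "Ranch"): 7,
}
shares = normalize_by_group(counts)
```

After this step both towns are on the same 0 to 1 scale, which is the kind of effect the "after normalization" panel is meant to convey.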
14. Example 3: “Plain” Raw Data
Task: Correlate house price and towns under $1.5M
Original visualization | After ordering towns
[Axes: Price, Town (ordered)]
15. Example 4: “Plain” Raw Data
Task: What is my emotional style?
Original visualization | After semantic-temporal segmentation [Pan et al. IUI 2013]
16. Technical Challenges
Determine proper data transformation for
different visualization situations
– Difficult to predict visualization situations involving
multiple factors: data, user, and types of visualization
Certain situations require multiple data
transformations
Balance multiple, potentially conflicting factors
– Quality of visualization and performance
17. Our Approach
Optimization-based approach to automatically derive data
transformations that maximize visualization quality
Components: Data Retrieval | Data Transformer | Visualization Generation | Visualization Recommender
Data: Original Data (D) | Transformed Data | Visualization Type (Vt)
Input: Original data D, visualization type Vt
Output: A set of transformation operators Op = {…, op[i], …}
where reward ∑ desirability(D, Vt, Op) is maximized
[Wen and Zhou IUI 2008, InfoVis 2008]
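The input/output formulation above can be illustrated with a small search. This is a sketch under stated assumptions, not the method from the cited papers: the two operators, the terms inside `desirability`, and the exhaustive search over operator subsets are all hypothetical stand-ins chosen to keep the example short.

```python
from itertools import combinations

def drop_outliers(values):
    """Hypothetical operator: drop points more than 10x the median."""
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    return [v for v in values if v <= 10 * median]

def sort_ascending(values):
    """Hypothetical operator: order values to expose trends."""
    return sorted(values)

OPERATORS = {"drop_outliers": drop_outliers, "sort": sort_ascending}

def desirability(original, transformed, vis_type, ops):
    """Hypothetical reward: legibility + ordering + fidelity - cost."""
    if not transformed:
        return float("-inf")
    lo, hi = min(transformed), max(transformed)
    legibility = lo / hi if hi > 0 else 0.0       # closer to 1: even scale
    ordering = 0.0
    if vis_type == "bar_chart" and transformed == sorted(transformed):
        ordering = 1.0                            # ordered bars read better
    fidelity = len(transformed) / len(original)   # share of data kept
    cost = 0.05 * len(ops)                        # proxy for running time
    return legibility + ordering + fidelity - cost

def best_transformation(data, vis_type, operators=OPERATORS):
    """Try every operator subset (in a fixed order) and keep the best."""
    best = ((), list(data), desirability(data, list(data), vis_type, ()))
    names = list(operators)
    for k in range(1, len(names) + 1):
        for ops in combinations(names, k):
            out = list(data)
            for name in ops:
                out = operators[name](out)
            score = desirability(data, out, vis_type, ops)
            if score > best[2]:
                best = (ops, out, score)
    return best

# One noisy value (1000) distorts the scale of the other points.
ops, transformed, score = best_transformation(
    [12, 9, 15, 11, 1000, 10], "bar_chart"
)
```

The cost term stands in for the quality-versus-performance trade-off listed under Technical Challenges. Exhaustive search is only workable for a handful of operators.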
19. Data Transformation: What’s Next
What additional desirability metrics
should we consider?
How to perform data transformation in
context (incremental transformation)?
How to scale out to support exabytes of
data for different user tasks and
situations?
22. Visualization Recommendation: Types
Data-driven recommendation
– Dynamically recommend suitable
visualizations based on data, display, and
user tasks
Behavior-driven recommendation
– Dynamically track user interactions and
detect behavior patterns to recommend
suitable visualizations in context
Display → Display (+ new data + task)
Display → Display (+ user behavior)
23. Data-Driven Visualization Recommendation
Two situations
– Single display
– Multiple, consecutive displays
Multiple methods
– Rule based
– Planning based
– Machine learning based
Adapted from [Roth et al., CHI 94]
[Mackinlay ’86; Roth & Mattis CHI ’94; Zhou & Feiner InfoVis 96; Zhou & Feiner CHI 98; Zhou IJCAI 99; Zhou & Chen InfoVis 02; Zhou & Chen IJCAI 03; Wen & Zhou InfoVis 05]
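Of the methods listed on this slide, the rule-based one is the easiest to sketch. The rules, the field-type vocabulary, and the function name `recommend_chart` below are illustrative assumptions. They are not taken from any of the cited systems.

```python
def recommend_chart(field_types, task=None):
    """Pick a chart from hand-written rules over data and task features.

    field_types maps a field name to one of the hypothetical type tags
    'categorical', 'numeric', 'temporal', or 'geographic'.
    """
    types = list(field_types.values())
    if "geographic" in types:
        return "map"
    if "temporal" in types and "numeric" in types:
        return "line chart"
    if types.count("numeric") >= 2 and task == "correlate":
        return "scatter plot"
    if "categorical" in types and "numeric" in types:
        return "bar chart"
    return "table"  # fallback when no rule fires

# Hypothetical requests modeled on the house examples in this talk.
summary = recommend_chart({"town": "categorical", "price": "numeric"}, "summarize")
relation = recommend_chart({"price": "numeric", "area": "numeric"}, "correlate")
locations = recommend_chart({"location": "geographic", "price": "numeric"})
```

Rule order encodes priority, which is the usual weakness of this method: every new data type or task needs another hand-written rule.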
24. Behavior-Driven Visualization
Recommendation
Observation
– Users tend to stay with an unsuitable visualization and
compensate with a large number of interactions
instead of changing the visualization
Goal
– Detect user interaction patterns and make pattern-based
visualization recommendations
27. Behavior-Driven Visualization
Recommendation: Example 2
Display: Map of the Market
User interactions: repeatedly change time windows for two industries
Time Window:
26 weeks
52 weeks
…
28. Behavior-Driven Visualization
Recommendation: Example 2
Pattern: Flip
Visualization Recommendation: line chart for direct trend comparison
[Line chart: Utility vs. Networking]
[Gotz and Wen IUI 2009]
29. Behavior-Driven Visualization
Recommendation: Pattern-Based Approach
Inputs: User Interactions | Task Features | Data Features
Steps: Pattern Detection | Pattern-Task Matching | Pattern-Data Matching | Example Match
Output: Visualization Recommendations
Patterns: Scan | Flip | Swap | …
[Gotz and Wen IUI 2009]
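The pattern-detection step can be sketched for the Flip pattern from Example 2. The definition used here is an assumption made for illustration: a flip is a trailing run of interactions on one parameter that alternates between exactly two values. The event format, the `focus` parameter name, and the `min_switches` threshold are hypothetical.

```python
def detect_flip(events, min_switches=3):
    """Return (parameter, (value_a, value_b)) if the log ends in a flip.

    events is a list of (parameter, value) pairs, oldest first.
    """
    if not events:
        return None
    param = events[-1][0]
    # Collect the trailing run of events that touch the same parameter.
    values = []
    for p, v in reversed(events):
        if p != param:
            break
        values.append(v)
    values.reverse()
    # Grow the alternating suffix A, B, A, B, ... from the end backwards.
    run = 1
    for i in range(len(values) - 1, 0, -1):
        if values[i] == values[i - 1]:
            break  # no switch between neighbors
        if i + 1 < len(values) and values[i - 1] != values[i + 1]:
            break  # a third value appeared
        run += 1
    if run - 1 < min_switches:
        return None
    return param, tuple(sorted(set(values[-run:])))

# Hypothetical log: the user keeps switching between two industries.
events = [
    ("zoom", "in"),
    ("focus", "Utility"),
    ("focus", "Networking"),
    ("focus", "Utility"),
    ("focus", "Networking"),
]
flip = detect_flip(events)
suggestion = "line chart for direct trend comparison" if flip else None
```

A detected pattern would then be matched against task and data features before a recommendation is made. That matching is left out of this sketch.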
30. Recommending Visual Interactions
Automatically annotate and suggest follow-on user
interactions based on displayed visual features
Original display | Annotated display
[Kandogan VAST’2012]
32. Visualization Recommendation:
What’s Next
Recommend a suitable heterogeneous
visualization as a consecutive display
Recommend the composition of two or more
existing visualizations
[Figure: Vehicle Group + Vehicle Age = ? (axis: Cost)]
[Wen, Zhou & Aggarwal, InfoVis 05; Heer & Robertson, InfoVis 07]
[Yang, Li, & Zhou 2013]
33. Visualization Recommendation:
What’s Next
“Individualized” (hyper-personalized), adaptive visualization
– By cognitive style and personality [Gardner 1983]
– By one’s emotional/affective states (Sadness, Optimism, Trust)
Big 5 Personality Model
– O: inventive/curious vs. consistent/cautious
– C: efficient/organized vs. easy-going/careless
– E: outgoing/energetic vs. solitary/reserved
– A: friendly/compassionate vs. cold/unkind
– N: sensitive/nervous vs. secure/confident
35. Insight Revision and Provenance
Insight revision
– Users amend derived insights
to correct analytic mistakes
or make personalized
adjustments
Insight provenance
– Users record interactions and
insight for continuation and
reuse
[Figure: User Actions (Zoom In, Edit, Query, Filter) by Neuroticism (high, low) and Extroversion (low, high)]
36. Insight Revision
A crowd-powered approach to insight revision
– Users amend various types of text analytics mistakes
– The system adopts amendments on which multiple users agree
Correcting sentiment
classification error
[Hu et al. INTERACT 2013]
37. Insight Revision
A crowd-powered approach to insight revision
– Users amend various types of text analytics mistakes
– The system adopts amendments on which multiple users agree
Correcting summarization label error
[Hu et al. INTERACT 2013]
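Adopting only amendments that several users agree on can be sketched as a simple consensus check. The thresholds and the function name `accept_amendment` are hypothetical, and the cited system may use a different rule.

```python
from collections import Counter

def accept_amendment(votes, min_users=3, min_agreement=0.6):
    """Return the agreed label, or None if the crowd is not consistent.

    votes maps a user id to the corrected label that user proposed.
    """
    if len(votes) < min_users:
        return None  # too few users to trust the amendment
    label, count = Counter(votes.values()).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return label
    return None

# Hypothetical amendments to one sentiment classification.
agreed = accept_amendment({"u1": "positive", "u2": "positive", "u3": "negative"})
too_few = accept_amendment({"u1": "positive", "u2": "negative"})
```

Requiring a minimum number of users is one way to limit the effect of a single mistaken or abusive edit.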
38. Insight Provenance
[Gotz and Zhou InfoVis 2009]
An action-based approach to insight provenance
– “Actions” capture observable and semantically
meaningful user interactions
• Three types of actions: Exploration | Insight | Meta
– “Action trails” capture the sequence of actions leading to an
insight, for insight provenance
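[Formula, garbled in extraction: defines an action trail τ in terms of exploration actions A_Exploration and an insight action A_Insight, using the operators ∘ and +]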
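A simplified reading of action trails can be sketched as follows: each insight action closes a trail made up of the exploration actions that came before it. This is an assumption for illustration and is coarser than the definition in the cited paper. The action names and the choice to skip meta actions are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Action:
    kind: str  # "exploration", "insight", or "meta"
    name: str

def extract_trails(log):
    """Group exploration actions into the trail of the insight they led to."""
    trails, current = [], []
    for action in log:
        if action.kind == "exploration":
            current.append(action.name)
        elif action.kind == "insight":
            trails.append((action.name, current))
            current = []
        # Meta actions are skipped in this sketch.
    return trails

# Hypothetical interaction log.
log = [
    Action("exploration", "query"),
    Action("exploration", "filter"),
    Action("meta", "undo"),
    Action("exploration", "zoom"),
    Action("insight", "bookmark"),
    Action("exploration", "query"),
    Action("insight", "annotate"),
]
trails = extract_trails(log)
```

Stored trails can be replayed or handed to another user, which supports the continuation and reuse mentioned under insight provenance.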
40. Insight Revision and Provenance:
What’s Next
Balance crowd input and personalized
adjustments
– Reconcile diverse user amendments vs. prevent potential
system abuse
Detect and learn different types of logical
structures from user interactions
– Automatically infer and predict user interaction patterns
to better support and anticipate user tasks
?
41. Summary
User: Tell me what’s in my data
System: Here is the “big picture” of your data. I also suggest you look into …
User: Something is wrong… should be… Remember what I have done so far
System: I incorporated your and others’ feedback. Please continue …
dougblakely.com
“Big data” is of high volume, heterogeneous, and often “dirty”
It requires both users and computers to take initiatives for
effective visual analytics of big data
Data Transformation | Visualization Recommendation | Insight Revision & Provenance
42. Acknowledgements
IBM Research, Almaden
– Eser Kandogan, Fei Wang, Huahai Yang, Liang Gou, Ying Xuan, Eben Haber,
Yunyao Li
IBM T. J. Watson
– David Gotz, Zhen Wen, Shimei Pan, Jie Lu, Min Chen*, Sheng Ma*, Peter
Kissa*, Vikram Aggarwal*
IBM Research, China
– Shixia Liu*, Nan Cao, Yangqiu Song*, Weihong Qian
Summer interns
– Ying Feng (Indiana University)
– Basak Alper (UC Santa Barbara)
– Mengdie Hu (Georgia Tech)
– Jian Zhao (University of Toronto)
43. References of Our Work
David Gotz and Michelle X. Zhou: Characterizing users' visual analytic activity for insight provenance. Information Visualization 8(1): 42-55, 2009.
David Gotz and Zhen Wen: Behavior-driven visualization recommendation. IUI 2009: 315-324.
Eser Kandogan: Just-in-time annotation of clusters, outliers, and trends in point-based data visualizations. IEEE VAST 2012: 73-82.
Mengdie Hu, Huahai Yang, Michelle X. Zhou, Liang Gou, Yunyao Li, and Eben Haber: OpinionBlocks: A Crowd-Powered, Self-Improving Interactive Visual Analytic System for Understanding Opinion Text. To appear in Proc. INTERACT 2013.
Zhen Wen and Michelle X. Zhou: Evaluating the Use of Data Transformation for Information Visualization. IEEE Trans. Vis. Comp. Graph. 14(6): 1309-1316, 2008.
Zhen Wen and Michelle X. Zhou: An optimization-based approach to dynamic data transformation for smart visualization. IUI 2008: 70-79.
Zhen Wen, Michelle X. Zhou, and Vikram Aggarwal: An Optimization-based Approach to Dynamic Visual Context Management. INFOVIS 2005: 25-32.
Huahai Yang, Yunyao Li, and Michelle X. Zhou: A Crowd-sourced Study: Understanding Users’ Comprehension and Preferences for Composing Information Graphics. In submission to TOCHI, 2013.
Michelle X. Zhou and Min Chen: Automated Generation of Graphic Sketches by Example. IJCAI 2003: 65-74.
Michelle X. Zhou, Min Chen, and Ying Feng: Building a Visual Database for Example-based Graphics Generation. INFOVIS 2002: 23-30.
Michelle X. Zhou, Sheng Ma, and Ying Feng: Applying machine learning to automated information graphics generation. IBM Systems Journal 41(3): 504-523, 2002.