Interactive Latency in
Big Data Visualization
Zhicheng “Leo” Liu
Jan 22, 2014
Latency:
a measure of time delay experienced in a system
rotational latency
network latency
query latency
interactive latency
Questions
How to reduce interactive latency in big data visualization?
How does interactive latency affect user behavior?
Reducing Latency
More memory
in-memory data store
Clever indexing
cube representation schemes
Parallel processing
multicore, GPGPU, distributed platforms
imMens: a holistic approach
Perceptual scalability
Binned aggregation as primary data reduction strategy
Interactive scalability
Multivariate data tiles
Parallel query processing and rendering on GPU
[Liu et al. 2013]
Guiding Principle
Perceptual & interactive scalability should be limited
by the chosen resolution of the visualized data,
not the number of records.
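As a minimal sketch of the principle above, the following bins records into a fixed 2D grid with NumPy. The function name `bin_2d`, the 64-bin resolution, and the uniform random data are assumptions made for illustration; they are not taken from imMens.

```python
import numpy as np

def bin_2d(x, y, x_range, y_range, bins):
    """Count records per cell of a bins x bins grid over the given ranges."""
    counts, _, _ = np.histogram2d(x, y, bins=bins, range=[x_range, y_range])
    return counts.astype(np.int64)

rng = np.random.default_rng(0)
for n_records in (1_000, 100_000):
    # Synthetic records in [0, 1) x [0, 1) (hypothetical data).
    x = rng.uniform(0.0, 1.0, n_records)
    y = rng.uniform(0.0, 1.0, n_records)
    grid = bin_2d(x, y, (0.0, 1.0), (0.0, 1.0), bins=64)
    # The grid shape depends only on the bin count; the counts sum to the record total.
    print(n_records, grid.shape, int(grid.sum()))
```

Whatever the record count, the structure handed to the renderer stays 64 x 64 cells; only the values stored in the cells change.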
[Figure sequence: Data; Alpha-blending; Sampling; Modeling; Binned Aggregation]
Google Fusion Tables: Sampling
[Figure panels: Sampling; Aggregation]
Binned Plots: Design Space
[Figure: binned plot types for numeric, ordinal/categorical, temporal, and geographic data, in 1D and 2D]
Demo
Multivariate Data Tiles
Projections / Materialized database views
Provide data for dynamic visualization
Much faster than a traditional data cube
Brush & Link: A Naïve Approach
[Figure: full 5-D data cube; axes X (bins 256 to 767), Y (bins 512 to 1023), Month (0 to 11), Day (0 to 30), Hour (0 to 23)]
12 x 31 x 24 x 512 x 512 = ~2.3 billion cells
Brushing Over January
[Figure: the same cube restricted to a single month; axes X, Y, Day (0 to 30), Hour (0 to 23)]
31 x 24 x 512 x 512 = ~195 million cells
Sum Along Day
[Figure: Day collapsed to the range [0 - 30]; remaining axes X, Y, Hour (0 to 23)]
24 x 512 x 512 = ~6 million cells
Sum Along Hour
[Figure: Day collapsed to [0 - 30] and Hour collapsed to [0 - 23]; remaining axes X, Y]
512 x 512 cells
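The cell counts on the four slides above can be checked directly, and the same slice-then-sum steps can be run on a small stand-in cube. This is a sketch only: the 16 x 16 spatial resolution, the axis order, and the random counts are assumptions chosen so the example runs quickly.

```python
import numpy as np

# Cell counts for the full-resolution cube on the slides.
full = 12 * 31 * 24 * 512 * 512      # month x day x hour x X x Y
january = 31 * 24 * 512 * 512        # after restricting to one month
after_day = 24 * 512 * 512           # after summing along day
after_hour = 512 * 512               # after summing along hour
print(full, january, after_day, after_hour)

# Same steps on a reduced cube (16 x 16 spatial bins, an assumption for speed).
rng = np.random.default_rng(1)
cube = rng.integers(0, 5, size=(12, 31, 24, 16, 16))  # axes: month, day, hour, x, y

january_slice = cube[0]               # brush over January: shape (31, 24, 16, 16)
hour_xy = january_slice.sum(axis=0)   # sum along day:      shape (24, 16, 16)
heatmap = hour_xy.sum(axis=0)         # sum along hour:     shape (16, 16)
print(january_slice.shape, hour_xy.shape, heatmap.shape)
```

Every brush event in the naïve approach repeats the two summations over the whole slice, which is why the cost follows the cube size rather than the size of the plot being updated.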
Decomposing a Data Cube
For any pair of 1D or 2D binned plots, the
maximum number of dimensions needed
to support brushing & linking is 4.
[Figure: the full 5-D cube (X, Y, Month, Day, Hour) summed (Σ) into 3-D cubes: Month-Day-Hour, X-Y-Hour, X-Y-Day, X-Y-Month]
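A small check of the decomposition idea, under assumed sizes: brushing a month and reading back the X-Y heatmap gives the same answer from a pre-aggregated 3-D projection as from the full 5-D cube. The cube dimensions, axis order, and helper names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical small 5-D cube; axes: month, day, hour, x, y.
cube = rng.integers(0, 4, size=(12, 31, 24, 8, 8))

# Pre-aggregated 3-D projection over (month, x, y): sum out day and hour once.
month_xy = cube.sum(axis=(1, 2))

def brush_month_full(cube, month):
    """Answer the brush from the full cube: slice the month, then sum day and hour."""
    return cube[month].sum(axis=(0, 1))

def brush_month_projection(month_xy, month):
    """Answer the same brush from the 3-D projection: a plain slice, no summation."""
    return month_xy[month]

same = np.array_equal(brush_month_full(cube, 3), brush_month_projection(month_xy, 3))
print(same, cube.size, month_xy.size)
```

The projection holds 12 x 8 x 8 cells against 12 x 31 x 24 x 8 x 8 for the cube, and linking a 1D month histogram to a 2D X-Y plot never needs the day or hour axes.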
  
Tiles
[Figure: tiled X-Y-Day cube; X: 256-511, X: 512-767; Y: 512-767, Y: 768-1023; Day: 31 bins]
From Datacube to Data Tiles
[Figure: the X-Y-Day cube (labels: Y: 512 - 1023, day: 0 - 31) split into four tiles, each with Day bins 0 to 30: Y 512-767 with X 256-511; Y 512-767 with X 512-767; Y 768-1023 with X 256-511; Y 768-1023 with X 512-767]
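The split from one projection into fixed-size tiles can be sketched as follows. The 256-bin tile side and the X and Y offsets follow the ranges shown on the slides; the function name `split_into_tiles` and the dictionary keyed by starting bin are assumptions made for illustration.

```python
import numpy as np

TILE = 256  # spatial bins per tile side, matching the ranges on the slides

def split_into_tiles(projection, x0, y0, tile=TILE):
    """Split an (x, y, day) projection into tiles keyed by their starting bin indices."""
    nx, ny, _ = projection.shape
    tiles = {}
    for i in range(0, nx, tile):
        for j in range(0, ny, tile):
            tiles[(x0 + i, y0 + j)] = projection[i:i + tile, j:j + tile, :]
    return tiles

# X bins 256..767, Y bins 512..1023, 31 day bins (empty counts as a placeholder).
projection = np.zeros((512, 512, 31), dtype=np.uint32)
tiles = split_into_tiles(projection, x0=256, y0=512)
for key in sorted(tiles):
    print(key, tiles[key].shape)
```

Only the tiles that intersect the visible region need to be fetched, so panning and zooming touch a bounded amount of data.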
Data Tiles
x1-y1-month
x1-y1-day
x1-y1-hour
x1-y2-month
x1-y2-day
x1-y2-hour
x2-y1-month
x2-y1-day
x2-y1-hour
x2-y2-month
x2-y2-day
x2-y2-hour
month-day-hour
imMens Architecture
[Figure: Server (SciDB, Postgres) and Client (UI control, Visualization); interactions: specify, brush & link, zoom & pan]
Client-Side Processing
[Figure: data tiles (e.g., X [512-767], Y [768-1023], bins 0 to 11) stored as R, G, B, A values; Pass 1: query fragment shader writes projections to an off-screen FBO; Pass 2: render fragment shader draws to the canvas]
Pack data tiles as images (352KB for Brightkite)
Bind to WebGL context as textures
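The slides do not spell out how counts are laid out in the image channels. One plausible scheme, shown here purely as an assumption, splits each 32-bit count across the R, G, B, and A bytes of a pixel so that a tile slice can be shipped as an ordinary image. The function names are hypothetical.

```python
import numpy as np

def pack_rgba(counts):
    """Pack a 2-D array of non-negative counts (< 2**32) into an H x W x 4 uint8 image."""
    as_u32 = np.ascontiguousarray(counts, dtype="<u4")  # little-endian 32-bit values
    return as_u32.view(np.uint8).reshape(counts.shape[0], counts.shape[1], 4)

def unpack_rgba(image):
    """Inverse of pack_rgba: rebuild the 32-bit counts from the four channels."""
    r, g, b, a = (image[..., k].astype(np.uint32) for k in range(4))
    return r | (g << 8) | (b << 16) | (a << 24)

counts = np.array([[0, 1, 256], [65536, 2**24, 2**32 - 1]], dtype=np.uint32)
image = pack_rgba(counts)
restored = unpack_rgba(image)
print(image.shape, bool(np.array_equal(restored, counts)))
```

In a shader the same reconstruction would be done per texel; the NumPy version only demonstrates that the packing is lossless.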
  	
  
Performance Benchmarks
Simulate brush & linking across plots in a scatter plot matrix
imMens vs. full data cube
60 synthesized datasets
Parameters
bin count per dimension (10, 20, 30, 40, 50)
number of records (10K, 100K, 1M, 10M, 100M, 1B)
number of dimensions (4, 5)
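The count of 60 datasets follows from the parameter grid: 5 bin counts x 6 record counts x 2 dimension counts. The sketch below enumerates that grid; `full_cube_cells` is a hypothetical helper that gives the cell count of a dense cube with the same number of bins on every dimension.

```python
from itertools import product

bin_counts = (10, 20, 30, 40, 50)
record_counts = (10_000, 100_000, 1_000_000, 10_000_000, 100_000_000, 1_000_000_000)
dimension_counts = (4, 5)

# One benchmark configuration per (bins, records, dimensions) combination.
configs = list(product(bin_counts, record_counts, dimension_counts))
print(len(configs))

def full_cube_cells(bins, dims):
    """Cells in a dense cube with `bins` bins on each of `dims` dimensions."""
    return bins ** dims

largest_cube = full_cube_cells(50, 5)
print(largest_cube)
```

The dense cube size depends on bins and dimensions but not on the number of records, which is the same scaling argument as the guiding principle earlier in the deck.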
[Figure: benchmark results; values shown: 51.9, 52.3, 51.6, 52.0, 53.2, 52.1, 5.5, 3.0, 2.2]
Google Chrome v.23.0.1271.95 on a quad-core 2.3 GHz MacBook Pro (OS X 10.8.2) with per-core 256K L2
caches, shared 6MB L3 cache and 8GB RAM. PCI Express NVIDIA GeForce GT 650M graphics card with
1024MB video RAM.
50fps querying and
rendering of 1B data points
Speed of Thought?
Newell (1994): Unified Theories of Cognition
Newell (1994)         Card et al (1983)      Example                                   Time Range
deliberate act        perceptual fusion      recognize a pattern, track animation      ~100 milliseconds
cognitive operation   unprepared response    click a link, select an object            ~1 second
unit task             unit task              edit a line of text, make a chess move    ~10 seconds
~300ms: The Embodiment Level
Deictic Strategy
Pointing movements bind objects in the world
Small changes in cost of binding
cause different cognitive behavior
Latency affects high-level/longitudinal strategies
Block-copying
Ballard et al (1995, 1997)
8-puzzle solving
O’Hara and Payne (1998, 1999)
Search
Brutlag (2009)
Exploratory Visual Analysis?
Latency Conditions
Operation      Low        High
brush & link   ~20ms      ~20ms + 500ms
select         ~20ms      ~20ms + 500ms
pan            ~100ms     ~100ms + 500ms
zoom           ~1000ms    ~1000ms + 500ms
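The two conditions can be written down as a lookup: the high-latency condition adds a fixed 500 ms to each operation's baseline. The dictionary and function names below are assumptions; the numbers are the approximate values from the slide.

```python
# Approximate baseline latencies from the slide, in milliseconds.
BASE_LATENCY_MS = {"brush & link": 20, "select": 20, "pan": 100, "zoom": 1000}
ADDED_DELAY_MS = 500  # extra delay in the high-latency condition

def latency_ms(operation, condition):
    """Return the approximate latency of an operation under 'low' or 'high'."""
    base = BASE_LATENCY_MS[operation]
    if condition == "low":
        return base
    if condition == "high":
        return base + ADDED_DELAY_MS
    raise ValueError(f"unknown condition: {condition!r}")

for op in BASE_LATENCY_MS:
    low, high = latency_ms(op, "low"), latency_ms(op, "high")
    print(f"{op}: {low} ms -> {high} ms ({high / low:.1f}x)")
```

The same 500 ms is a 26x increase for brushing but only 1.5x for zooming, so the added delay weighs very differently across operations.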
Datasets
Study Design
16 participants, 32 observations
2 X 2 between subject
interaction logs
audio transcripts
Log Events
System and Mouse Events
brush, select, zoom, pan, clear, color slider, log scale
tiles cached,
mouse down, mouse up, mouse move
Trigger vs. Processed System Events
debouncing keeps system usable
timestamp, event type, parameters
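A minimal sketch of a log record and of trailing-edge debouncing, which separates triggered events from processed ones. The `LogEvent` fields mirror the slide (timestamp, event type, parameters); the 50 ms window, the function name, and the assumption that events arrive sorted by time are illustrative choices, not details from the study.

```python
from dataclasses import dataclass, field

@dataclass
class LogEvent:
    timestamp_ms: int
    event_type: str
    parameters: dict = field(default_factory=dict)

def debounce(triggered, wait_ms):
    """Keep an event only if no later event of the same type arrives within wait_ms.

    Assumes `triggered` is sorted by timestamp (trailing-edge debounce).
    """
    processed = []
    for i, ev in enumerate(triggered):
        superseded = any(
            later.event_type == ev.event_type
            and 0 <= later.timestamp_ms - ev.timestamp_ms < wait_ms
            for later in triggered[i + 1:]
        )
        if not superseded:
            processed.append(ev)
    return processed

triggered = [LogEvent(t, "brush", {"bin": t // 10}) for t in (0, 10, 20, 200)]
processed = debounce(triggered, wait_ms=50)
print([ev.timestamp_ms for ev in processed])
```

Logging both streams makes it possible to compare how often users trigger an operation with how often the system actually processes one.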
Normalized Processed Events
How to Evaluate Performance?
The purpose of visualization is insight,
not pictures.
Counting Insights
What is an insight?
“many new airlines emerged around year 2003”
“HP started in 2001, AS in 2003, PI in 2004, OH in 2003”
“OH started in 2003, and they are doing pretty well
in terms of delays”
Questions
How to reduce interactive latency in big data visualization?
imMens: a system supporting real-time interaction
binned aggregation for perceptual scalability
multivariate data tiles & GPU processing for low latency
How does interactive latency affect user behavior?
Comparative study: quantitative & qualitative analysis
Acknowledgment
Jeffrey Heer
Biye Jiang
Thank You
