Jaipaul Agonus & Daniel Monteiro, FINRA Technology
SCALING VISUALIZATION FOR BIG DATA AND ANALYTICS IN THE CLOUD
Investor Protection
Market Integrity
• 3,800 firms
• 634,000 brokers
• 12 markets/exchanges
• 37 billion events on average each day
• 25+ PB of storage
• Up to 100 billion events per day
• Trillions of nodes and edges
• 100s of surveillance programs
• 5000+ running instances
• 150+ applications
Session Takeaways
• How FINRA Leverages AWS Infrastructure
• Scaling Techniques for Cloud Resources
• Data Visualization Principles
• Challenges and Strategies in Big Data Visualization
• Data Visualization in Market Surveillance
Surveillance Analyst Needs
Interactive Data Access
• Drill down into data using hierarchical dataset models
• Export datasets to Excel with custom formatting
Visual Analysis
• Highlight interesting events and outliers
• Support for contextual visualization
Review and Feedback
• Workflow for reviewing datasets
• Support for adding feedback “tags” and comments on datasets
Visual display of data plays a fundamental role in articulating ideas and knowledge.
The ability to “see” the data can enhance and transform our perception of it.
Scaling data visualization is not only about displaying more pixels, but about displaying the right ones in the right context.
“15000 Galaxies In One Image.”
Image credit: NASA, ESA, P. Oesch of the University of Geneva, and M. Montes of the University of New South Wales.
Anaximander (~500 BC): John Mansley Robinson, An Introduction to Early Greek Philosophy, Houghton Mifflin, 1968.
Mercator (1570): Atlas of Europe, British Library
Google Maps (2019): google.com
“A map does not just chart, it unlocks and formulates meaning; it forms bridges between here and there, between disparate ideas that we did not know were previously connected.”
Reif Larsen, The Selected Works of T.S. Spivet
“And what is the use of a book,” thought Alice, “without pictures or conversations?”
Alice in Wonderland
Insights through visually apparent patterns and trends
Problem
• Question
• Theory
• Goal
Model
• Algorithms
• Experimentation
• Validation
Data
• Collect
• Explore
• Prepare
Results
• Decisions
• Reports
• Communication
Variable   Values
A          1, 2, 3, 4
B          Low, High, Medium
C          2018-03-27, 2018-03-28, 2018-03-29
Position, Shape, Color, Size
[Figure: chart labeled with X, Y, Z, and Size]
“It seems that perfection is attained not when there is nothing more to add, but when there is nothing more to remove.”
Antoine de Saint-Exupéry, Terre des Hommes (1939)
Visual Elements
Visual encoding is the process of transforming data into a visual element to be displayed in any kind of visualization.
Data
• Events are positioned according to their original time sequence.
• Shape size reflects the event's relative volume.
• Groups (Firms, Symbols, Other Classifiers) are displayed in different colors.
• Event volume can also be indicated by its position.
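The encoding rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` record, the `encode` function, the size range, and the color palette are all hypothetical names and values chosen for the example, not part of the tooling described in the slides.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical event record used only for this illustration."""
    timestamp: str  # ISO-8601 text, so plain string sorting is chronological
    group: str      # e.g. a firm or symbol
    volume: int


# Assumed palette; any list of distinguishable colors would do.
PALETTE = ["tab:blue", "tab:orange", "tab:green", "tab:red"]


def encode(events, min_size=4.0, max_size=40.0):
    """Map events to visual channels: x, y, size, and color."""
    if not events:
        return []
    ordered = sorted(events, key=lambda e: e.timestamp)
    max_volume = max(e.volume for e in ordered)
    groups = sorted({e.group for e in ordered})
    color_of = {g: PALETTE[i % len(PALETTE)] for i, g in enumerate(groups)}

    marks = []
    for x, e in enumerate(ordered):
        # Relative volume in [0, 1]; guard against an all-zero dataset.
        rel = e.volume / max_volume if max_volume else 0.0
        marks.append({
            "x": x,                  # position follows the time sequence
            "y": e.volume,           # volume can also be shown by position
            "size": min_size + rel * (max_size - min_size),
            "color": color_of[e.group],  # one color per group
        })
    return marks
```

The returned list of dictionaries is independent of any plotting library, so the same marks could be handed to whichever renderer is in use.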
“The First Rule of Data Visualization is that ”
“The Second Rule of Data Visualization is that you stay true to the data”
“If you don't know where you want to go, then it doesn't matter which path you take.”
The Cheshire Cat, Alice in Wonderland
Purpose
• Theory
• Question
• Story
• Exploratory vs Explanatory
Emphasis on Data
• Events over Time
• Relationships
• Patterns
Form Follows Function
• Simplicity
• Meaning
• Context
We look across market data where we can see hot spots and outliers.
Interesting events can be visualized with additional details and context.
Exploratory
• Patterns
• Trends
• Outliers
Explanatory
• Specific Violations
• Surveillance Oriented
• Context
• 3,700 firms
• 634,000 brokers
• 12 markets/exchanges
• Up to 100 billion events per day
Production & Experimentation
Perception & Interaction
Volume & Dimensionality
Scalable Blueprints
Filtering
Sampling
Aggregation
Data Prep
Navigation
Onboarding
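Filtering, sampling, and aggregation are the three listed strategies that reduce data volume before anything is drawn. The sketch below shows one plain-Python reading of each. The event layout (dictionaries with `time`, `symbol`, and `volume` keys) and the function names are assumptions made for the example; the slides do not describe how these steps are actually implemented.

```python
import random
from collections import Counter


def filter_events(events, symbol):
    """Filtering: keep only the events for one symbol."""
    return [e for e in events if e["symbol"] == symbol]


def sample_events(events, k, seed=0):
    """Sampling: draw at most k events uniformly at random, reproducibly."""
    if k >= len(events):
        return list(events)
    return random.Random(seed).sample(events, k)


def aggregate_by_minute(events):
    """Aggregation: count events per minute from 'HH:MM:SS' timestamps."""
    counts = Counter(e["time"][:5] for e in events)  # 'HH:MM:SS' -> 'HH:MM'
    return dict(sorted(counts.items()))


events = [
    {"time": "09:30:01", "symbol": "ABC", "volume": 100},
    {"time": "09:30:45", "symbol": "ABC", "volume": 200},
    {"time": "09:31:10", "symbol": "XYZ", "volume": 50},
    {"time": "09:31:30", "symbol": "ABC", "volume": 75},
]
per_minute = aggregate_by_minute(events)
```

Each step trades detail for size: filtering narrows the question, sampling keeps individual events but fewer of them, and aggregation replaces events with summaries that fit a fixed number of marks.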
"The problem is not the problem.
The problem is your attitude about the problem."
Jack Sparrow, Pirates of the Caribbean
Parallel Coordinates
Multi-dimensional
Distribution and concentration of features
Feature selection and order
Raw and scaled values
Feature distribution “shapes”
Parallel coordinates chart
http://www.math.tau.ac.il/~aiisreal/
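The "raw and scaled values" point can be illustrated with min-max scaling, which puts every feature on a common 0 to 1 axis so that lines in a parallel coordinates chart are comparable across axes. Min-max scaling is an assumption here (the slides do not say which scaling is used), and `scale_features` is a hypothetical helper. The order of the `features` argument sets the order of the axes.

```python
def scale_features(rows, features):
    """Min-max scale each feature to [0, 1], one output line per row.

    `features` selects which columns become axes and in which order.
    A constant feature has no spread, so it is placed at mid-axis (0.5).
    """
    lo = {f: min(r[f] for r in rows) for f in features}
    hi = {f: max(r[f] for r in rows) for f in features}

    scaled = []
    for r in rows:
        line = []
        for f in features:
            span = hi[f] - lo[f]
            line.append(0.5 if span == 0 else (r[f] - lo[f]) / span)
        scaled.append(line)
    return scaled


rows = [
    {"price": 10.0, "volume": 100, "orders": 5},
    {"price": 20.0, "volume": 300, "orders": 5},
    {"price": 15.0, "volume": 200, "orders": 5},
]
lines = scale_features(rows, ["price", "volume", "orders"])
```

Each inner list holds the vertical position of one record on every axis, which is what a renderer connects into a polyline.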
Horizon Graphs
Time series comparison
Small multiples
Space efficient
Patterns over time
Hot Spot detection
Horizon graph
panopticon.com
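A horizon graph gets its space efficiency by slicing a series into equal-height bands and drawing the bands on top of one another, usually in progressively darker shades, with negative values mirrored and colored differently. The sketch below computes only the band values. The function name `horizon_bands` and the default of three bands are assumptions for the example, and the rendering step is left out.

```python
def horizon_bands(values, num_bands=3):
    """Split a series into stacked bands for a horizon graph.

    Returns (band_height, bands, signs):
      bands[i][t] is how much of band i is filled at time t,
      signs[t] is +1 or -1 so negative values can be colored differently.
    """
    peak = max((abs(v) for v in values), default=0.0)
    height = peak / num_bands if peak else 1.0

    bands = []
    for i in range(num_bands):
        floor = i * height
        # Clip the magnitude above this band's floor to the band height.
        bands.append([min(max(abs(v) - floor, 0.0), height) for v in values])

    signs = [1 if v >= 0 else -1 for v in values]
    return height, bands, signs


height, bands, signs = horizon_bands([0, 3, 6, 9, -6], num_bands=3)
```

Because every band is at most `height` tall, the chart needs only one third of the vertical space of a plain area chart when three bands are used, which is what makes small multiples of many series practical.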
Multi-dimensional
Network and Relationships
Added elements for visual encoding
* Experimental in SuRF
Hive plot
hiveplot.com
Multiple contexts
Space efficient
Incremental display
* Experimental in SuRF
https://www.usgs.gov/media/images/gis-data-layers-visualization
Q&A
Strata Data Conference 2019 : Scaling Visualization for Big Data in the Cloud


Editor's notes

  1. Excel charts: color schemas