Álvaro Santamaría
Data Scientist – ITRS
@dofideas
Joel Brunger
System Engineer - MapR
@joelbrunger
Lessons Learned with Visualisation
and Machine Learning for Big Data
Make real-time decisions
based on historical wisdom
Jay Kreps – 2013
“The Log: What every software engineer should know about real-time data's unifying abstraction”
The uber-system: Lego-like, OS-based
1. Big-data viz is not a matter of “front-end”
2. 𝜅arao 𝜅e visualisations
3. Information extraction
SELECT count(),
  average(temperature),
  median(temperature),
  max(temperature),
  ...
  tdigest(temperature)
GROUP BY country,
  timestamp window of 10 minutes
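The query on this slide can be read as a tumbling-window aggregation. Below is a minimal Python sketch of that idea, assuming readings arrive as `(country, timestamp, temperature)` tuples. The function names and the sample readings are made up for illustration and are not from the talk. The exact `statistics.median` stands in for `tdigest`, which would normally give an approximate, mergeable quantile summary instead.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean, median

WINDOW_SECONDS = 10 * 60  # 10-minute tumbling windows, as in the query


def window_start(ts: datetime) -> datetime:
    """Round a timezone-aware timestamp down to the start of its window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)


def aggregate(readings):
    """Group readings by (country, window) and compute the summary columns."""
    groups = defaultdict(list)
    for country, ts, temperature in readings:
        groups[(country, window_start(ts))].append(temperature)
    return {
        key: {
            "count": len(temps),
            "average": mean(temps),
            "median": median(temps),  # exact; a t-digest would approximate this
            "max": max(temps),
        }
        for key, temps in groups.items()
    }


# Hypothetical sample data, for illustration only.
readings = [
    ("ES", datetime(2018, 5, 24, 10, 1, tzinfo=timezone.utc), 21.0),
    ("ES", datetime(2018, 5, 24, 10, 7, tzinfo=timezone.utc), 23.0),
    ("ES", datetime(2018, 5, 24, 10, 12, tzinfo=timezone.utc), 25.0),
    ("UK", datetime(2018, 5, 24, 10, 3, tzinfo=timezone.utc), 15.0),
]
result = aggregate(readings)
```

Keeping every raw value per window, as this sketch does, is only practical for small data. A sketch structure such as a t-digest is attractive at scale because partial summaries can be merged across nodes and windows.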
4. Hold state…
5. … and deliver (state).
6. Build your pipeline
J on the Beach
All Data, One Platform, Every Cloud
Limitless Possibilities
What is MapR?
MapR is the industry’s leading data platform for AI and Analytics.
Rendezvous Architecture
The Decoy is designed just to collect data inputs
Introducing the Canary
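As a rough illustration of how a decoy and a canary might sit alongside other models, here is a minimal in-process Python sketch. It assumes every model receives the same request, that the decoy only records inputs (as the slide states), and that the canary is a stable baseline used for comparison. The `rendezvous` function, the preference list and the toy models are hypothetical. A real deployment would run this over message streams rather than function calls.

```python
from typing import Callable, Dict, List, Optional

Model = Callable[[float], Optional[float]]


class Decoy:
    """Looks like a model to the pipeline but only records the inputs it sees."""

    def __init__(self) -> None:
        self.seen: List[float] = []

    def __call__(self, x: float) -> Optional[float]:
        self.seen.append(x)
        return None  # a decoy never produces a score


def rendezvous(request: float, models: Dict[str, Model], preferred: List[str]):
    """Send the request to every model, then return the first available
    result in preference order, together with all collected results."""
    results = {name: model(request) for name, model in models.items()}
    for name in preferred:
        if results.get(name) is not None:
            return name, results[name], results
    raise RuntimeError("no model produced a result")


# Toy models, for illustration only.
def canary(x: float) -> float:  # stable baseline
    return 2.0 * x


def candidate(x: float) -> float:  # newer model under evaluation
    return 2.0 * x + 0.5


decoy = Decoy()
models = {"decoy": decoy, "canary": canary, "candidate": candidate}

chosen, answer, all_results = rendezvous(3.0, models, preferred=["candidate", "canary"])
# Comparing a candidate against the canary gives a simple drift signal.
drift = abs(all_results["candidate"] - all_results["canary"])
```

Because the decoy stores exactly what the live models were given, its log can later be replayed to test or train new models on realistic inputs.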
Why did ITRS choose MapR for ‘Gateway Hub’?
- Simplicity (integrated platform)
- Real-time
- Processing must be performed in the cluster
- Enterprise features
MapR enables ITRS ‘Gateway Hub’ to provide the following benefits
- Smarter monitoring
- Additional features, applications and services
- Global Data Fabric
- Support for ML in real time
Thank you
JOnTheBeach 2018

Lessons learned building a big data analytics engine, from proprietary to open source