Real-time Hadoop: The Ideal Messaging System for Hadoop
DataWorks Summit/Hadoop Summit
1.
© 2016 MapR Technologies
2.
Contact Information
Ted Dunning
Chief Applications Architect at MapR Technologies
Committer & PMC for Apache’s Drill, Zookeeper & others
VP of Incubator at Apache Foundation
Email: tdunning@apache.org, tdunning@maprtech.com
Twitter: @ted_dunning
Hashtags today: #stratahadoop #ojai
3.
Streaming Architecture
by Ted Dunning and Ellen Friedman © 2016 (published by O’Reilly)
Free copies at book signing today 3:40PM @ MapR booth
http://bit.ly/mapr-ebook-streams
4.
Goals
• Real-time or near-time
  – Includes situations with deadlines
  – Also includes situations where delay is simply undesirable
  – Even includes situations where delay is just fine
• Micro-services
  – Streaming is a convenient idiom for design
  – Micro-services … you know we wanted it
  – Service isolation is a key requirement
5.
Real-time or Near-time?
• The real point is flow versus state (see talk later today)
• One consequence of flow-based computing is real-time and near-time become relatively easy
• Life may be a bitch, but it doesn’t happen in batches!
6.
Agenda
• Background / micro-services
• Global requirements
• Scale
7.
A microservice is loosely coupled with bounded context
8.
How to Couple Services and Break micro-ness
• Shared schemas, relational stores
• Ad hoc communication between services
• Enterprise service busses
• Brittle protocols
• Poor protocol versioning
9.
How to Decouple Services
• Use self-describing data
• Private databases
• Infrastructural communication between services
• Use modern protocols
• Adopt future-proof protocol practices
• Use shared storage where necessary due to scale
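One way to read "use self-describing data" and "future-proof protocol practices" is the tolerant-reader pattern: field names travel with each message, and a consumer picks out only the fields it needs. The sketch below is a minimal illustration under that assumption; the field names (`pump`, `flow`, `winding_temp`) and the helper names are hypothetical and do not come from the talk.

```python
import json


def encode(record):
    # Self-describing: field names travel with every message.
    return json.dumps(record, sort_keys=True)


def read_flow(message, default=None):
    """Tolerant reader: take the field this service needs, ignore the rest."""
    record = json.loads(message)
    return record.get("flow", default)


# Hypothetical messages from three producer versions.
v1 = encode({"pump": "p-17", "flow": 3.2})
v2 = encode({"pump": "p-17", "flow": 3.4, "winding_temp": 71.0})  # field added
v0 = encode({"pump": "p-17"})  # older producer, field missing

flows = [read_flow(m) for m in (v1, v2, v0)]
```

Because the reader neither rejects unknown fields nor requires optional ones, producers and consumers can be upgraded independently, which is the decoupling the slide asks for.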
10.
What is the Right Structure for Flow Compute?
• Traditional message queues?
  – Message queues are classic answer
  – Key feature/bug is out-of-order acknowledgement
  – Many implementations
  – You pay a huge performance hit for persistence
• Kafka-esque Logs?
  – Logs are like queues, but with ordering
  – Out of order consumption is possible, acknowledgement not so much
  – Canonical base implementation is Kafka
  – Performance plus persistence
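The contrast between the two structures can be sketched with two toy in-memory classes. This is an illustration only, not how any real broker is implemented: `AckQueue` and `Log` are hypothetical names, and persistence, partitioning and networking are left out entirely.

```python
from collections import OrderedDict


class AckQueue:
    """Toy message queue: each message is acknowledged individually, in any order."""

    def __init__(self):
        self._next_id = 0
        self._unacked = OrderedDict()  # message id -> payload

    def publish(self, payload):
        self._unacked[self._next_id] = payload
        self._next_id += 1

    def pending(self):
        return list(self._unacked.items())

    def ack(self, msg_id):
        # Out-of-order acknowledgement: any pending message may be removed.
        del self._unacked[msg_id]


class Log:
    """Toy Kafka-style log: entries keep their order and readers never remove them."""

    def __init__(self):
        self._entries = []
        self._committed = {}  # consumer name -> next offset to read

    def append(self, payload):
        self._entries.append(payload)
        return len(self._entries) - 1  # offset of the new entry

    def read(self, offset, max_count=10):
        # A reader may start at any offset, so out-of-order consumption is possible.
        return self._entries[offset:offset + max_count]

    def commit(self, consumer, offset):
        # Progress is a single number per consumer, not one ack per message.
        self._committed[consumer] = offset

    def committed(self, consumer):
        return self._committed.get(consumer, 0)


queue = AckQueue()
log = Log()
for event in ["a", "b", "c"]:
    queue.publish(event)
    log.append(event)

queue.ack(1)  # acknowledge "b" before "a"
queue_left = [payload for _, payload in queue.pending()]

log.commit("fraud-detector", 2)  # "a" and "b" have been processed
replay = log.read(0)  # a second reader can still replay everything
resume = log.read(log.committed("fraud-detector"))
```

The queue has to track every unacknowledged message, while the log tracks one offset per consumer and leaves the data in place, which is what makes replay by additional consumers cheap.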
11.
Scenarios
Profile Database
12.
The task?
[Diagram labels: POS 1 (location, t, card #; yes/no?), POS 2 (location, t, card #; yes/no?)]
13.
Traditional Solution
[Diagram labels: POS 1..n, Fraud detector, Last card use]
14.
What Happens Next?
[Diagram labels: three POS 1..n / Fraud detector pairs, one Last card use]
15.
What Happens Next?
[Diagram labels: three POS 1..n / Fraud detector pairs, one Last card use]
16.
How to Get Service Isolation
[Diagram labels: POS 1..n, Fraud detector, Last card use, Updater, card activity]
17.
New Uses of Data
[Diagram labels: POS 1..n, Fraud detector, Last card use, Updater, Card location history, Other, card activity]
18.
Scaling Through Isolation
[Diagram labels: two sets of POS 1..n, Fraud detector, Last card use and Updater; one card activity]
19.
Lessons
• De-coupling and isolation are key
• Private data stores/tables are important,
  – but local storage of private data is a bug
• Propagate events, not table updates
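The fraud-detection scenario can be sketched as follows, assuming the arrangement suggested by the diagram labels: card events are appended to a card activity stream, an updater consumes that stream and maintains a private "last card use" table, and the fraud detector only reads the table. All class and function names, the in-memory list standing in for the stream, and the one-hour rule in `is_suspicious` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CardEvent:
    card: str
    location: str
    t: float  # event time in seconds


class LastCardUse:
    """Private table owned by the updater; other services only read it."""

    def __init__(self):
        self._rows = {}

    def apply(self, event):
        current = self._rows.get(event.card)
        if current is None or event.t >= current.t:
            self._rows[event.card] = event

    def get(self, card):
        return self._rows.get(card)


def run_updater(stream, table, start=0):
    """Consume the card activity stream from `start`; return the next offset."""
    for event in stream[start:]:
        table.apply(event)
    return len(stream)


def is_suspicious(table, event, min_gap=3600.0):
    # Hypothetical rule: same card, different location, less than min_gap apart.
    last = table.get(event.card)
    if last is None:
        return False
    return last.location != event.location and event.t - last.t < min_gap


card_activity = []  # stands in for a persistent stream or topic
table = LastCardUse()

first = CardEvent("1234", "Tokyo", 0.0)
verdict_first = is_suspicious(table, first)
card_activity.append(first)  # propagate the event, not a table update
offset = run_updater(card_activity, table)

second = CardEvent("1234", "Singapore", 600.0)
verdict_second = is_suspicious(table, second)
card_activity.append(second)
offset = run_updater(card_activity, table, offset)

# A new consumer (for example a card location history) replays the same stream.
history = [e.location for e in card_activity if e.card == "1234"]
```

Because only events cross the service boundary, the location history consumer can be added later without touching the fraud detector or the updater's table.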
20.
Scenarios
IoT Data Aggregation
21.
Basic Situation
[Diagram labels: Each location has many pumps; pump data; Multiple locations]
22.
What Does a Pump Look Like
[Diagram labels]
• Inlet: temperature, pressure, flow
• Outlet: temperature, pressure, flow
• Motor: winding temperature, voltage, current
23.
Basic Situation
[Diagram labels: Each location has many pumps; pump data; Multiple locations]
24.
Basic Architecture Reflects Business Structure
[Diagram labels: pump data (four sources)]
25.
Lessons
• Data architecture should reflect business structure
• Even very modest designs involve multiple data centers
• Schemas cannot be frozen in the real world
• Security must follow data ownership
26.
Scenarios
Global Data Recovery
27.
[Map labels: Tokyo, Corporate HQ]
28.
[Map labels: Singapore, Tokyo, Corporate HQ]
29.
[Map labels: Singapore, Tokyo, Corporate HQ]
30.
[Map labels: Singapore, Tokyo, Corporate HQ]
31.
Lessons
• Arbitrary number of topics important for simplicity + performance
• Updates happen in many places
• Mobility implies change in replication patterns
• Multi-master updates simplify design massively
32.
Converged Requirements
33.
What Have We Learned?
• Need persistence and performance
  – Possibly for years and to 100’s of millions t/s
• Must have convergence
  – Need files, tables AND streams
  – Need volumes, snapshots, mirrors, permissions and …
• Must have platform security
  – Cannot depend on perimeter
  – Must follow business structure
• Must have global scale and scope
  – Millions of topics for natural designs
  – Multi-master replication and update
34.
The Importance of Common APIs
• Commonality and interoperability are critical
  – Compare the Hadoop ecosystem and the NoSQL world
• Table stakes
  – Persistence
  – Performance
  – Polymorphism
• Major trend so far is to adopt the Kafka API
  – 0.9 API and beyond remove major abstraction leaks
  – Kafka API supported by all major Hadoop vendors
35.
What we do
36.
Evolution of Data Storage
[Chart axes: Functionality, Compatibility, Scalability. Plotted: Linux, POSIX]
Over decades of progress, Unix-based systems have set the standard for compatibility and functionality
37.
Evolution of Data Storage
[Chart axes: Functionality, Compatibility, Scalability. Plotted: Linux, POSIX, Hadoop]
Hadoop achieves much higher scalability by trading away essentially all of this compatibility
38.
Evolution of Data Storage
[Chart axes: Functionality, Compatibility, Scalability. Plotted: Linux, POSIX, Hadoop]
MapR enhanced Apache Hadoop by restoring the compatibility while increasing scalability and performance
39.
Evolution of Data Storage
[Chart axes: Functionality, Compatibility, Scalability. Plotted: Linux, POSIX, Hadoop]
Adding tables and streams enhances the functionality of the base file system
40.
http://bit.ly/fastest-big-data
41.
How we do this with MapR
• MapR Streams is a C++ reimplementation of the Kafka API
  – Advantages in predictability, performance, scale
  – Common security and permissions with the entire MapR converged data platform
• Semantic extensions
  – A cluster contains volumes, files, tables … and now streams
  – Streams contain topics
  – Can have a default stream or can name a stream by path name
• Core MapR capabilities preserved
  – Consistent snapshots, mirrors, multi-master replication
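The naming scheme on this slide (streams live in the cluster namespace, topics live inside streams) can be sketched in a few lines. This is only an illustration under stated assumptions, not MapR client code: the `stream-path:topic` separator, the `DEFAULT_STREAM` path, and the `split_topic` helper are all hypothetical names chosen for the example.

```python
from typing import NamedTuple

# Hypothetical default stream path, used when a bare topic name is given.
DEFAULT_STREAM = "/apps/default-stream"


class TopicRef(NamedTuple):
    stream: str  # path name of the stream in the cluster namespace
    topic: str   # topic name inside that stream


def split_topic(name: str, default_stream: str = DEFAULT_STREAM) -> TopicRef:
    """Resolve '/path/to/stream:topic' or a bare 'topic' to (stream, topic)."""
    if name.startswith("/"):
        # Path-qualified form: everything before the last ':' is the stream.
        stream, sep, topic = name.rpartition(":")
        if not sep or not stream or not topic:
            raise ValueError(f"expected '/path/to/stream:topic', got {name!r}")
        return TopicRef(stream, topic)
    if not name or ":" in name:
        raise ValueError(f"invalid bare topic name: {name!r}")
    # Bare form: fall back to the (assumed) default stream.
    return TopicRef(default_stream, name)
```

With this convention, `split_topic("/pumps/tokyo:pressure")` resolves to the stream `/pumps/tokyo` and the topic `pressure`, while a bare `pressure` resolves against the default stream.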
42.
MapR original Innovations
• Volumes
  – Distributed management
  – Data placement
• Read/write random access file system
  – Allows distributed meta-data
  – Improved scaling
  – Enables NFS access
• Application-level NIC bonding
• Transactionally correct snapshots and mirrors
43.
MapR's Containers
• Files/directories are sharded into blocks, which are placed into containers on disks
• Containers are 16-32 GB segments of disk, placed on nodes
• Each container contains
  – Directories & files
  – Data blocks
• Replicated on servers
• No need to manage directly
44.
MapR's Containers
• Each container has a replication chain
• Updates are transactional
• Failures are handled by rearranging replication
45.
Container locations and replication
[Diagram: CLDB entries "N1, N2", "N3, N2", "N1, N2", "N1, N3", "N3, N2" for nodes N1, N2, N3]
Container location database (CLDB) keeps track of nodes hosting each container and replication chain order
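A toy model may help picture what "keeps track of nodes hosting each container and replication chain order" and "failures are handled by rearranging replication" mean together. The sketch below is an assumption-laden stand-in, not MapR's CLDB: the class name, the method names, and the repair policy (drop the failed node, append a spare at the tail) are all invented for illustration.

```python
from typing import Dict, List


class ContainerLocations:
    """Toy stand-in for a container location database (not MapR's CLDB)."""

    def __init__(self) -> None:
        # container id -> ordered replication chain of node names (head first)
        self.chains: Dict[int, List[str]] = {}

    def register(self, container_id: int, chain: List[str]) -> None:
        self.chains[container_id] = list(chain)

    def lookup(self, container_id: int) -> List[str]:
        return list(self.chains[container_id])

    def node_failed(self, node: str, spares: List[str]) -> None:
        """Drop a failed node from every chain and append a spare at the tail.

        Assumed policy for this sketch: surviving replicas keep their order,
        and the first spare not already in the chain joins at the end.
        """
        for chain in self.chains.values():
            if node not in chain:
                continue
            chain.remove(node)
            replacement = next(
                (s for s in spares if s != node and s not in chain), None
            )
            if replacement is not None:
                chain.append(replacement)
```

For example, with chains `[N1, N2]`, `[N3, N2]` and `[N1, N3]`, a failure of N2 leaves the third chain untouched and rewrites the first two so that each again names two live nodes.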
46.
MapR Scaling
• Containers represent 16-32 GB of data
  – Each can hold up to 1 billion files and directories
  – 100M containers = ~2 exabytes (a very large cluster)
• 250 bytes DRAM to cache a container
  – 25 GB to cache all containers for a 2 EB cluster
  – But not necessary, can page to disk
  – Typical large 10 PB cluster needs 2 GB
• Container-reports are 100x-1000x < HDFS block-reports
  – Serve 100x more data-nodes
  – Increase container size to 64 GB to serve a 4 EB cluster
  – Map/reduce not affected
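The headline figures on this slide follow from simple multiplication. The sketch below reproduces two of them (total capacity for 100M containers and the DRAM needed to cache them all), assuming decimal units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes); the function and variable names are illustrative only.

```python
GB = 10**9   # decimal gigabyte (assumption for this sketch)
EB = 10**18  # decimal exabyte (assumption for this sketch)

containers = 100_000_000          # 100M containers, from the slide
bytes_per_cached_container = 250  # DRAM per cached container, from the slide


def capacity_eb(container_gb: int, n: int = containers) -> float:
    """Total data addressed by n containers of the given size, in exabytes."""
    return n * container_gb * GB / EB


low_eb = capacity_eb(16)    # lower bound with 16 GB containers
high_eb = capacity_eb(32)   # upper bound with 32 GB containers
cache_gb = containers * bytes_per_cached_container / GB

print(f"{low_eb} to {high_eb} EB addressed, {cache_gb} GB of DRAM to cache")
```

The range works out to 1.6 to 3.2 EB, which the slide rounds to "~2 exabytes", and the cache comes to 25 GB. Doubling the container size to 64 GB doubles the capacity for the same number of containers, which is the reasoning behind the 4 EB line.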
47.
But Wait, There’s More
• Directories and files are implemented in terms of B-trees
  – Key is offset, value is data blob
  – Internal transactional semantics guarantees safety and consistency
  – Layout algorithms give very high layout linearization
• Tables are implemented in terms of B-trees
  – Twisted B-tree implementation allows virtues of log-structured merge tree without the compaction delays
  – Tablet splitting without pausing, integration with file system transactions
• Common security and permissions scheme
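The "key is offset, value is data blob" idea can be illustrated with an ordered map. The sketch below uses a sorted Python list with `bisect` as a stand-in for a B-tree, so it shows only the lookup pattern, not MapR's on-disk structure, transactions, or layout; `OffsetFile` and its methods are hypothetical names.

```python
import bisect
from typing import List


class OffsetFile:
    """File contents as a sorted map: key = byte offset, value = data blob."""

    def __init__(self) -> None:
        self.offsets: List[int] = []  # sorted chunk start offsets
        self.blobs: List[bytes] = []  # blob stored at the matching offset

    def size(self) -> int:
        return self.offsets[-1] + len(self.blobs[-1]) if self.offsets else 0

    def append(self, data: bytes) -> int:
        """Store data as a new chunk at the end of the file; return its offset."""
        offset = self.size()
        if data:  # skip empty chunks so offsets stay strictly increasing
            self.offsets.append(offset)
            self.blobs.append(data)
        return offset

    def read(self, offset: int, length: int) -> bytes:
        """Read up to `length` bytes from `offset`, spanning chunks as needed."""
        if offset < 0 or length < 0:
            raise ValueError("offset and length must be non-negative")
        out = bytearray()
        # Find the last chunk whose start offset is <= the requested offset.
        i = bisect.bisect_right(self.offsets, offset) - 1
        while 0 <= i < len(self.offsets) and len(out) < length:
            start = offset + len(out) - self.offsets[i]
            out += self.blobs[i][start:start + length - len(out)]
            i += 1
        return bytes(out)
```

A read locates the chunk covering the requested offset with one ordered lookup and then walks forward through neighbouring keys, which is the access pattern an offset-keyed B-tree supports.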
48.
[Diagram labels: Table, Tablet, Partition]
• Similar to LSM implementations, tables are decomposed by key ranges
• Distinct from HBase and LevelDB, MapR tables use a fixed number (greater than 1) of decompositions
• Very unusually, relative to LSM and cousins, data structures at the leaf are mutable
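Decomposition by key range amounts to keeping a sorted list of split keys and routing each row key to the range it falls in. The sketch below shows one level of that routing, assuming string row keys; `KeyRangeRouter` is a hypothetical helper, and nothing here reflects how MapR tables actually store or split data.

```python
import bisect
from typing import List


class KeyRangeRouter:
    """Route row keys to tablets by sorted split points (hypothetical helper)."""

    def __init__(self, split_keys: List[str]) -> None:
        # n split keys define n + 1 contiguous key ranges (tablets).
        self.split_keys = sorted(split_keys)

    def tablet_for(self, key: str) -> int:
        """Index of the tablet whose range contains the key.

        A key equal to a split key belongs to the range that starts there.
        """
        return bisect.bisect_right(self.split_keys, key)

    def split(self, new_split_key: str) -> None:
        """Split one tablet in two by adding a boundary inside its range."""
        if new_split_key not in self.split_keys:
            bisect.insort(self.split_keys, new_split_key)
```

With split keys `g` and `p`, keys below `g` go to tablet 0, keys from `g` up to (but not including) `p` go to tablet 1, and the rest go to tablet 2. Adding a split key only changes routing for keys in the range being divided.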
49.
Re-use of Proven Technology
• Partitions are distributed just like file chunks
• Same replication and transaction technology
50.
And More …
• Streams are implemented in terms of B-trees as well
  – Topics and consumer offsets are kept in the stream, not in ZooKeeper (ZK)
  – Similar splitting technology to MapR DB tables
  – Consistent permissions, security, data replication
• Standard Kafka 0.9 API
• Plans to add OJAI for high-level structuring
• Performance is very high
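Keeping consumer offsets in the stream itself means messages and read positions live in one store and travel together under the same replication and permissions. The in-memory sketch below illustrates that shape only; it is not the Kafka or MapR Streams API, and `Stream`, `produce`, `poll`, and `commit` are names made up for the example.

```python
from typing import Dict, List, Tuple


class Stream:
    """Toy stream: topic messages and consumer offsets share one store."""

    def __init__(self) -> None:
        # topic name -> ordered list of messages
        self.topics: Dict[str, List[bytes]] = {}
        # (consumer group, topic) -> next offset to read
        self.committed: Dict[Tuple[str, str], int] = {}

    def produce(self, topic: str, message: bytes) -> int:
        """Append a message to a topic and return its offset."""
        log = self.topics.setdefault(topic, [])
        log.append(message)
        return len(log) - 1

    def poll(self, group: str, topic: str, max_records: int = 10) -> List[bytes]:
        """Return messages starting at the group's committed offset."""
        start = self.committed.get((group, topic), 0)
        return self.topics.get(topic, [])[start:start + max_records]

    def commit(self, group: str, topic: str, next_offset: int) -> None:
        """Record the group's position inside the stream, not in an external service."""
        self.committed[(group, topic)] = next_offset
```

Because the committed position is part of the stream's own state, copying or mirroring the stream object carries consumer progress along with the data, which is the design point the slide is making.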
51.
Example
[Diagram labels: Files, Table, Streams, Directories, Cluster, Volume mount point]
52.
[Diagram labels: Cluster, Volume mount point]
53.
Lessons
• APIs matter more than implementations
• There is plenty of room to innovate ahead of the community
• POSIX, HDFS, and HBase all define useful APIs
• Kafka 0.9+ does the same
54.
Call to action: Support the common APIs
55.
Call to action: Support the Kafka APIs
And come by the MapR booth to check out MapR Streams
57.
Streaming Architecture by Ted Dunning and Ellen Friedman © 2016 (published by O’Reilly)
Free copies at book signing today
http://bit.ly/mapr-ebook-streams
58.
6 Free ebooks
Streaming Architecture, Ted Dunning & Ellen Friedman, and MapR Streams
Read online: mapr.com/6ebooks-read
Download pdfs: mapr.com/6ebooks-pdf
59.
Thank you for coming today!
60.
…helping you put data technology to work
● Find answers
● Ask technical questions
● Join on-demand training course discussions
● Follow release announcements
● Share and vote on product ideas
● Find Meetup and event listings
Connect with fellow Apache Hadoop and Spark professionals
community.mapr.com
61.
Q&A
Engage with us!
@mapr maprtech tdunning@maprtech.com
MapR maprtech mapr-technologies