Streaming Goes Mainstream: Transport, Processing & Architecture
Ellen Friedman
12 October 2016, Women in Big Data Meetup #datawomen
© 2016 MapR Technologies
Contact Information
Ellen Friedman
Solutions Consultant, MapR Technologies
Committer Apache Drill & Apache Mahout projects
Author, O’Reilly short books
Email ellenf@apache.org efriedman@maprtech.com
Twitter @Ellen_Friedman #datawomen
Please support women in tech – help build
girls’ dreams of what they can accomplish
© Ellen Friedman 2015
The entire industry is undergoing a career change
Big Data has caught on
•  Potential value of big data approaches is widely recognized
•  Technologies for distributed storage at low cost are maturing
•  People are looking for operational and analytical solutions in order to take advantage of large-scale data opportunities…
•  Now there’s a new form of revolution based on streaming data
Why stream?
“Our best understanding comes when
our conclusions fit the evidence.
And that is most effectively done
when our analyses fit the way life
happens.”
- Introduction to Apache Flink
Friedman & Tzoumas (O’Reilly Sept 2016)
Life doesn’t happen in batches…
Time Series Data & the IoT
Sensors in airplanes not only send data to the ERD (black box); they also report back to manufacturers of “smart parts” such as turbines found in jet engines or wind farms.
Images © Friedman & Dunning from O’Reilly book A New Look at Anomaly Detection, used with permission
Big data project: Maury’s Wind and Currents charts
-  Value from big data in aggregate
-  Crowd-sourced
-  But static: not real-time insights
Modern big data navigation: WAZE
•  Uses real-time streaming traffic & road information shared by 65 million drivers per month
•  Intended to save fuel and time during commutes
•  Partnered with Esri GIS software to help put data insights to work for cities and states (11 Oct 2016 article in TechCrunch: http://bit.ly/tech-crunch-waze-esri)
•  Time-value of data is often important
“Outsmarting traffic, together” - WAZE website https://www.waze.com/
Crowd-sourced Traffic
Streaming sensor data + long-term maintenance histories:
•  Machine learning model detects anomalous pattern
•  Signals need for maintenance before damage occurs
Image courtesy Mtell; from Real World Hadoop by
Dunning & Friedman ( © 2015) Chap 6
Streaming is mainstream
Web-based Business
A: Real-time insights from low latency applications (update a real-time dashboard)
B: Current status updated in databases or search documents (Customer 360)
C: Durable messages for auditable history (Security analytics)
[Diagram: messages and logs flow to (A) real-time dashboards, (B) the Customer 360 database, and (C) archived data for security analytics]
Streaming data has value beyond
real-time insights
At the heart of an effective
streaming architecture is the
right choice of stream
transport.
Message Stream Transport
Apache Kafka
or
MapR Streams
Others
Message Transport Technology: Kafka & MapR Streams
Key capabilities
●  Highly scalable
●  High throughput, low latency
●  Decouple multiple producers & consumers
●  Durable messages with configurable time to live
●  Geo-distributed replication (MapR Streams)
[Diagram: producers write messages to the stream; multiple consumer groups read them]
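To make the decoupling and time-to-live ideas on this slide concrete, here is a minimal in-memory sketch. It is a toy model only: the class `MessageLog`, its methods, and the group names are hypothetical and are not the Kafka or MapR Streams API. It illustrates that producers append without knowing who reads, that each consumer group tracks its own position, and that messages stay available until their time to live expires.

```python
from collections import defaultdict


class MessageLog:
    """Toy append-only log; NOT the Kafka or MapR Streams API."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.base_offset = 0               # offset of the oldest retained message
        self.messages = []                 # (timestamp, payload) pairs
        self.committed = defaultdict(int)  # consumer group -> next offset to read

    def produce(self, timestamp, payload):
        """Append a message and return its offset."""
        self.messages.append((timestamp, payload))
        return self.base_offset + len(self.messages) - 1

    def expire(self, now):
        """Drop messages older than the time to live."""
        while self.messages and now - self.messages[0][0] > self.ttl_seconds:
            self.messages.pop(0)
            self.base_offset += 1

    def poll(self, group):
        """Return unread payloads for a group and advance only that group's offset."""
        start = max(self.committed[group], self.base_offset)
        index = start - self.base_offset
        payloads = [payload for _, payload in self.messages[index:]]
        self.committed[group] = self.base_offset + len(self.messages)
        return payloads


log = MessageLog(ttl_seconds=60)
log.produce(0, "click:home")
log.produce(10, "click:cart")

dashboard_first = log.poll("dashboard")   # both messages produced so far
log.produce(20, "click:pay")
dashboard_second = log.poll("dashboard")  # only the new message
audit_all = log.poll("audit")             # a different group still sees everything

log.expire(now=65)                        # the message from t=0 is now past its TTL
late_group = log.poll("late-joiner")      # a new group sees only retained messages
```

Because reading does not remove a message, adding a new consumer group later requires no change to the producers, which is the decoupling the slide refers to.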
Alert: Pre-conceptions can make you miss new ideas
•  It’s hard to order a coffee if you want mostly milk
•  Example: MapR Streams is part of the converged data platform, so it does not require a separate cluster for message transport (as you would with Kafka)
•  Example: Message streams can support microservices
“Getting Past Pre-conceptions” http://bit.ly/mapr-blog-ef-17-08
MapR Streams: Topics, Partitions
•  Data is assigned to topics (as in Kafka)
•  Topic can be partitioned for load balancing/ performance (as in Kafka)
•  Topic partition is distributed across the MapR cluster (not restricted to one node as in Kafka)
–  Makes long-term auditable history practical
[Diagram: Producer 1 and Producer 2 write to Topic 1; Consumers 1, 2, and 3 in a consumer group read from it]
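As a rough illustration of how partitioning a topic supports load balancing, the sketch below routes each message to a partition by hashing its key and then spreads partitions over the consumers in one group. The function names are hypothetical, and the CRC32 hash and round-robin assignment are chosen only for illustration; they are not claimed to be the partitioner or assignment strategy that Kafka or MapR Streams use.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition with a stable hash (CRC32, for illustration only)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Spread partitions round-robin over the consumers in one group."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Route keyed messages into three partitions of a single topic.
partitions = [[] for _ in range(3)]
for key, value in [("card-1", 10), ("card-2", 20), ("card-1", 30), ("card-3", 40)]:
    partitions[partition_for(key, 3)].append((key, value))

# Each consumer in the group takes responsibility for a share of the partitions.
assignment = assign_partitions(3, ["consumer-1", "consumer-2", "consumer-3"])
```

Messages with the same key land in the same partition, so their relative order is preserved for whichever consumer owns that partition.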
Stream-first Architecture: Basis for MicroServices
Stream as the shared “truth” instead of a database
Database as local truth
[Diagram labels: POS 1..n, Fraud detector, Last card use, Updater, Card analytics, Other card activity]
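The stream-first idea can be sketched in a few lines: the stream of card events is the shared record, and each service folds it into its own local state. Everything here is an assumption made for illustration, including the event fields, the function names, and the fraud rule with its 60-second threshold; none of it comes from the talk.

```python
# The shared "truth": an ordered stream of point-of-sale card events.
stream = [
    {"card": "A", "time": 0, "location": "Seattle"},
    {"card": "B", "time": 5, "location": "Tokyo"},
    {"card": "A", "time": 8, "location": "Sydney"},
    {"card": "B", "time": 500, "location": "Tokyo"},
]


def build_last_use(events):
    """Updater service: fold the stream into a local 'last card use' table."""
    last_use = {}
    for event in events:
        last_use[event["card"]] = event
    return last_use


def detect_fraud(events, min_travel_seconds=60):
    """Fraud detector: flag a card seen in two locations too close together in time.

    The rule and the 60-second threshold are invented for illustration.
    """
    last_seen = {}
    alerts = []
    for event in events:
        previous = last_seen.get(event["card"])
        if (
            previous is not None
            and previous["location"] != event["location"]
            and event["time"] - previous["time"] < min_travel_seconds
        ):
            alerts.append((event["card"], previous["location"], event["location"]))
        last_seen[event["card"]] = event
    return alerts


last_use = build_last_use(stream)
alerts = detect_fraud(stream)
rebuilt = build_last_use(stream)  # local state can be rebuilt by replaying the stream
```

Each service owns its local database, and neither depends on the other's tables; both depend only on the stream, which is what lets them evolve independently.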
MapR Streams: Part of MapR Converged Data Platform
[Diagram: Open Source Engines & Tools; Commercial Engines & Applications; Cloud & Managed Services; Search & Others; Custom Apps; Data Processing; Utility-Grade Platform Services; Enterprise Storage (MapR-FS), Database (MapR-DB), Event Streaming (MapR Streams); Global Namespace, High Availability, Data Protection, Self-healing, Unified Security, Real-time, Multi-tenancy; Unified Management and Monitoring]
MapR Converged Data Platform has distributed files, NoSQL DB & message streams engineered into one technology
Unique to MapR: Manage topics at Stream level
•  Topics are grouped together in a Stream (different from Kafka)
•  Policies such as time-to-live and ACEs are set at the Stream level (controlled access at this level is different from Kafka)
•  Geo-distributed replication at Stream level (different from Kafka)
[Diagram: a Stream containing Topic 1, Topic 2, and Topic 3]
MapR Streams:
Geo-distributed replication of
message stream across data centers
Multiple Stakeholders: Container Shipping
Over 20% of the world’s shipping containers pass through Singapore’s port.
Image © Ellen Friedman 2015
MapR Streams replication across data centers
A: Sensors stream data to on-board cluster that reports to onshore cluster while in port
B: MapR Streams geo-replication sends data to next port before ship arrives.
C: Real-time insights alert to “high humidity” in some containers
[Diagram: clusters in Singapore, Tokyo, Sydney, and Corporate HQ, with flows A, B, and C]
Find details on this use case in Chap 7 of book “Streaming Architecture”
Read online here: http://bit.ly/streams-ebook-ch7
MapR Streams: Replication Across Data Centers
What’s the value?
–  Replication across data centers with preserved offsets (unlike Kafka)
–  Opens new use cases:
–  Example: Shared inventory, as with ad-tech use case
[Diagram: Data center 1 and Data center 2, each with an inventory model and local state; a central data center with a database and global analytics]
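Why preserved offsets matter can be shown with a small sketch. It models each data center's copy of a stream as a plain dictionary from offset to message. This is a toy under stated assumptions: the functions `replicate` and `read_from`, the sample messages, and the dictionary model are all hypothetical and do not represent how MapR Streams implements replication.

```python
def replicate(source, replica):
    """Copy records the replica lacks, keeping each record's original offset."""
    for offset in sorted(source):
        if offset not in replica:
            replica[offset] = source[offset]


def read_from(log, committed_offset):
    """Return the records a consumer has not yet processed."""
    return [log[offset] for offset in sorted(log) if offset >= committed_offset]


singapore = {0: "container-7 loaded", 1: "humidity ok", 2: "humidity high"}
tokyo = {}
replicate(singapore, tokyo)

# A consumer processed offsets 0 and 1 against the Singapore copy,
# so its next offset to read is 2.
committed = 2
resumed_in_tokyo = read_from(tokyo, committed)
```

Since the replica uses the same offsets as the source, the position a consumer recorded in one data center still points at the same message in another, so it can resume there without translating positions.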
What about stream processing?
Several good choices for stream processing
•  You choose the tool you like for processing streaming data
–  MapR ships & supports the full Apache Spark stack including Spark
Streaming
–  Apache Flink has been benchmarked on MapR with extremely good
performance on MapR Streams transport; Flink not yet supported by
MapR
–  Other good options include Apache Apex (think Data Torrent) & Apache
Storm
Overview: Apache Flink Stream Processing
Figure 2-1 from “Introduction to Apache Flink” book, used with permission.
Download free pdf here: http://bit.ly/mapr-intro-flink-book-pdf
[Figure: transport layer (Kafka / MapR Streams), processing layer (Flink), plus database and file]
Overview: Apache Flink
•  Top level Apache project with big international OSS community
•  True stream processing
–  Advantage if SLAs require extremely low latency (real-time)
–  Good fit to continuous events
•  Also works well for batch processing
•  Being used in production (telecom; games)
Flink is BIG in Europe ;-)
Stream Processing: Compare Choices
“Real-time” event-by-event
processing
• Apache Flink
• Apache Apex
• Apache Storm
Not “real-time” processing:
micro-batching
•  Apache Spark Streaming
But latency is just one issue to consider in choosing a stream
processing technology…
Capabilities for Stream Processing Options
[Figure: capabilities compared for Flink, Spark Streaming, and Storm: correct under stress, correct time / window semantics, ease of use / expressiveness, high throughput, low latency]
Figure 1-2 from “Introduction to Apache Flink” book, used with permission.
Download free pdf here: http://bit.ly/mapr-intro-flink-book-pdf
Overview: Apache Flink Windowing
Before: Windows defined by micro-batches (not Flink)
Now: Windows defined by gaps between activity (this is Flink)
[Figures: rows A, B, and C shown under each windowing scheme]
Figures 3-1 and 3-2 from “Introduction to Apache Flink” book, used with permission.
Download free pdf here: http://bit.ly/mapr-intro-flink-book-pdf
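The difference between the two windowing approaches can be sketched in plain Python. This is not Flink code; `session_windows` and `fixed_windows` are hypothetical helper names, and the timestamps are made up. The first function groups events by gaps in activity, while the second cuts them at fixed intervals the way micro-batch boundaries would.

```python
def session_windows(timestamps, gap):
    """Group event times into sessions separated by more than `gap` of inactivity."""
    sessions = []
    for timestamp in sorted(timestamps):
        if sessions and timestamp - sessions[-1][-1] <= gap:
            sessions[-1].append(timestamp)
        else:
            sessions.append([timestamp])
    return sessions


def fixed_windows(timestamps, size):
    """Group event times into fixed intervals, as micro-batch boundaries would."""
    windows = {}
    for timestamp in sorted(timestamps):
        windows.setdefault(timestamp // size, []).append(timestamp)
    return [windows[index] for index in sorted(windows)]


clicks = [1, 3, 4, 9, 11, 30, 31]
by_gap = session_windows(clicks, gap=5)
by_batch = fixed_windows(clicks, size=10)
```

With these inputs, the gap-based version keeps the burst of activity from 1 through 11 together as one session, whereas the fixed boundary at 10 splits that same burst in two.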
Overview: Apache Flink Event Time
Computation can be based on when data is processed (processing time) OR when the event occurred (event time).
In many situations, processing by event time provides more accurate results.
Figure 3-3 from “Introduction to Apache Flink” book, used with permission.
Overview: Apache Flink Event Time
Stephan Ewen, Apache Flink PMC Committer, explaining event time
processing option for Flink in a Whiteboard Walkthrough video:
http://bit.ly/mapr-whiteboard-walkthrough-flink-event-time
When you analyze data by event time, you must take into account that events may arrive delayed or out of order.
This is important for use cases in which you want to correlate events.
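A small worked example may help show why a delayed event changes results. It is a plain-Python sketch, not Flink code; the function `count_by_window` and the sample timestamps are invented for illustration. Each event carries the time it occurred and the time it arrived, and the same events are counted in 10-unit windows both ways.

```python
from collections import Counter

# (event_time, arrival_time) pairs; the second event is delayed in transit.
events = [(1, 1), (4, 12), (6, 7), (11, 11), (13, 14)]


def count_by_window(times, size):
    """Count how many timestamps fall into each fixed window of `size` time units."""
    return dict(Counter(time // size for time in times))


by_event_time = count_by_window([event for event, _ in events], size=10)
by_processing_time = count_by_window([arrival for _, arrival in events], size=10)
```

The event that occurred at time 4 but arrived at time 12 is counted in the first window under event time and in the second window under processing time, so the two approaches report different counts for the same data.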
Apache Flink: Useful Characteristics
•  Stateful processing & accuracy under stress: Checkpoints
•  Windowing options are a good fit to the way natural sessions occur
•  Event time option for accurate computation
–  See Whiteboard Walkthrough video by Stephan Ewen (PMC member Apache
Flink) on event time
http://bit.ly/mapr-whiteboard-walkthrough-flink-event-time
•  Savepoints let you reprocess data (bug fixes, updates, etc)
–  See Whiteboard Walkthrough video by Stephan Ewen on Flink savepoints
http://bit.ly/whiteboard-walkthrough-flink-1
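The idea behind checkpointed stateful processing can be sketched as saving an operator's state together with its position in the input, then resuming from that pair after a failure. The class `CountingJob` and its methods are hypothetical and written only for illustration; Flink's actual checkpoint and savepoint mechanisms are distributed and considerably more involved than this single-process toy.

```python
import copy


class CountingJob:
    """Toy stateful operator: counts events per key and tracks its input offset."""

    def __init__(self):
        self.counts = {}
        self.offset = 0

    def process(self, stream, upto):
        """Consume stream[self.offset:upto], updating the counts."""
        for key in stream[self.offset:upto]:
            self.counts[key] = self.counts.get(key, 0) + 1
        self.offset = upto

    def snapshot(self):
        """Capture state and input position together so they stay consistent."""
        return {"counts": copy.deepcopy(self.counts), "offset": self.offset}

    @classmethod
    def restore(cls, snapshot):
        """Rebuild a job from a snapshot taken earlier."""
        job = cls()
        job.counts = copy.deepcopy(snapshot["counts"])
        job.offset = snapshot["offset"]
        return job


stream = ["a", "b", "a", "c", "a", "b"]

job = CountingJob()
job.process(stream, upto=3)
checkpoint = job.snapshot()
job.process(stream, upto=5)  # progress after the checkpoint is lost in a crash

recovered = CountingJob.restore(checkpoint)
recovered.process(stream, upto=len(stream))

uninterrupted = CountingJob()
uninterrupted.process(stream, upto=len(stream))
```

Because the recovered job replays the input from the saved offset, its final counts match a run that never failed. The same pattern, replaying a durable stream from a saved position, is what makes reprocessing after a bug fix possible.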
Streaming Resources from MapR (thank you)
Free resource from MapR: book on Apache Spark
Download free pdf courtesy of MapR Technologies: http://bit.ly/mapr-apache-spark-book-pdf
Or read online: http://bit.ly/mapr-apache-spark-ebook
Streaming Resources from MapR (thank you)
Free resource from MapR: book on stream-first architecture & message transport
Download free pdf courtesy of MapR Technologies: http://bit.ly/mapr-streams-ebook
Or read online: http://bit.ly/mapr-streaming-data-ebook
Streaming Resources from MapR (thank you)
Free resource from MapR: book on Apache Flink stream processing
Download free pdf courtesy of MapR Technologies: http://bit.ly/mapr-intro-flink-book-pdf
Or read online: <coming soon>
[Book cover: Introduction to Apache Flink: Stream Processing for Real Time and Beyond, by Ellen Friedman & Kostas Tzoumas]
In this book you’ll learn:
· What Apache Flink can do
· How it maintains consistency and provides flexibility
· How people are using it, including in production
· Best practices for streaming architectures
Download your copy: mapr.com/flink-book
Short Books by Ted Dunning & Ellen Friedman
For sale from Amazon or O’Reilly
Free pdf download courtesy of MapR www.mapr.com/ebook
http://bit.ly/ebook-real-world-hadoop
http://bit.ly/mapr-tsdb-ebook
http://bit.ly/ebook-anomaly
http://bit.ly/recommendation-ebook
http://bit.ly/mapr-ebook-sharing-data
Please support women in tech – help build
girls’ dreams of what they can accomplish
© Ellen Friedman 2015
Thank you!

Más contenido relacionado

La actualidad más candente

Converging your data landscape
Converging your data landscapeConverging your data landscape
Converging your data landscapeMapR Technologies
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformMapR Technologies
 
Advanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming DataAdvanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming DataCarol McDonald
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action MapR Technologies
 
Self-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataSelf-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataMapR Technologies
 
Meruvian - Introduction to MapR
Meruvian - Introduction to MapRMeruvian - Introduction to MapR
Meruvian - Introduction to MapRThe World Bank
 
How Big Data is Reducing Costs and Improving Outcomes in Health Care
How Big Data is Reducing Costs and Improving Outcomes in Health CareHow Big Data is Reducing Costs and Improving Outcomes in Health Care
How Big Data is Reducing Costs and Improving Outcomes in Health CareCarol McDonald
 
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...MapR Technologies
 
Applying Machine Learning to Live Patient Data
Applying Machine Learning to  Live Patient DataApplying Machine Learning to  Live Patient Data
Applying Machine Learning to Live Patient DataCarol McDonald
 
Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures Carol McDonald
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Mathieu Dumoulin
 
Demystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep LearningDemystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep LearningCarol McDonald
 
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBStructured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBCarol McDonald
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Carol McDonald
 
Streaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APIStreaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APICarol McDonald
 
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataState of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataMathieu Dumoulin
 
Machine Learning logistics
Machine Learning logisticsMachine Learning logistics
Machine Learning logisticsTed Dunning
 
Live Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn PredictionLive Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn PredictionMapR Technologies
 
Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1Carol McDonald
 
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...Carol McDonald
 

La actualidad más candente (20)

Converging your data landscape
Converging your data landscapeConverging your data landscape
Converging your data landscape
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data Platform
 
Advanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming DataAdvanced Threat Detection on Streaming Data
Advanced Threat Detection on Streaming Data
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action
 
Self-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your DataSelf-Service Data Science for Leveraging ML & AI on All of Your Data
Self-Service Data Science for Leveraging ML & AI on All of Your Data
 
Meruvian - Introduction to MapR
Meruvian - Introduction to MapRMeruvian - Introduction to MapR
Meruvian - Introduction to MapR
 
How Big Data is Reducing Costs and Improving Outcomes in Health Care
How Big Data is Reducing Costs and Improving Outcomes in Health CareHow Big Data is Reducing Costs and Improving Outcomes in Health Care
How Big Data is Reducing Costs and Improving Outcomes in Health Care
 
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
How to Leverage the Cloud for Business Solutions | Strata Data Conference Lon...
 
Applying Machine Learning to Live Patient Data
Applying Machine Learning to  Live Patient DataApplying Machine Learning to  Live Patient Data
Applying Machine Learning to Live Patient Data
 
Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures
 
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
Streaming Architecture to Connect Everything (Including Hybrid Cloud) - Strat...
 
Demystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep LearningDemystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep Learning
 
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBStructured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
 
Streaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka APIStreaming Patterns Revolutionary Architectures with the Kafka API
Streaming Patterns Revolutionary Architectures with the Kafka API
 
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor DataState of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
 
Machine Learning logistics
Machine Learning logisticsMachine Learning logistics
Machine Learning logistics
 
Live Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn PredictionLive Machine Learning Tutorial: Churn Prediction
Live Machine Learning Tutorial: Churn Prediction
 
Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1Fast Cars, Big Data How Streaming can help Formula 1
Fast Cars, Big Data How Streaming can help Formula 1
 
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real-Ti...
 

Similar a Streaming Goes Mainstream: New Architecture & Emerging Technologies for Stream Transport and Processing

Map r seattle streams meetup oct 2016
Map r seattle streams meetup   oct 2016Map r seattle streams meetup   oct 2016
Map r seattle streams meetup oct 2016Nitin Kumar
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications MapR Technologies
 
Where is Data Going? - RMDC Keynote
Where is Data Going? - RMDC KeynoteWhere is Data Going? - RMDC Keynote
Where is Data Going? - RMDC KeynoteTed Dunning
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainMapR Technologies
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged ApplicationsMapR Technologies
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016Mathieu Dumoulin
 
Real World Use Cases: Hadoop and NoSQL in Production
Real World Use Cases: Hadoop and NoSQL in ProductionReal World Use Cases: Hadoop and NoSQL in Production
Real World Use Cases: Hadoop and NoSQL in ProductionCodemotion
 
The Keys to Digital Transformation
The Keys to Digital TransformationThe Keys to Digital Transformation
The Keys to Digital TransformationMapR Technologies
 
Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Fast Cars, Big Data - How Streaming Can Help Formula 1
Fast Cars, Big Data - How Streaming Can Help Formula 1Tugdual Grall
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR Technologies
 
PrEstoCloud : PROACTIVE CLOUD RESOURCES MANAGEMENT AT THE EDGE FOR EFFICIENT ...
PrEstoCloud : PROACTIVE CLOUD RESOURCES MANAGEMENT AT THE EDGE FOR EFFICIENT ...PrEstoCloud : PROACTIVE CLOUD RESOURCES MANAGEMENT AT THE EDGE FOR EFFICIENT ...
PrEstoCloud : PROACTIVE CLOUD RESOURCES MANAGEMENT AT THE EDGE FOR EFFICIENT ...OW2
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Codemotion
 
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions ArchitectHUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions Architect
HUG Italy meet-up with Fabian Wilckens, MapR EMEA Solutions ArchitectSpagoWorld
 
MapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community EditionMapR 5.2: Getting More Value from the MapR Converged Community Edition
MapR 5.2: Getting More Value from the MapR Converged Community EditionMapR Technologies
 
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...
Real-World Machine Learning - Leverage the Features of MapR Converged Data Pl...Mathieu Dumoulin
 
Spark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating ExampleSpark and MapR Streams: A Motivating Example
Spark and MapR Streams: A Motivating ExampleIan Downard
 
Big Data LDN 2017: How to leverage the cloud for Business Solutions
Big Data LDN 2017: How to leverage the cloud for Business SolutionsBig Data LDN 2017: How to leverage the cloud for Business Solutions
Big Data LDN 2017: How to leverage the cloud for Business SolutionsMatt Stubbs
 
Handling the Extremes: Scaling and Streaming in Finance
Handling the Extremes: Scaling and Streaming in FinanceHandling the Extremes: Scaling and Streaming in Finance
Handling the Extremes: Scaling and Streaming in FinanceMapR Technologies
 

Similar a Streaming Goes Mainstream: New Architecture & Emerging Technologies for Stream Transport and Processing (20)

Map r seattle streams meetup oct 2016
Map r seattle streams meetup   oct 2016Map r seattle streams meetup   oct 2016
Map r seattle streams meetup oct 2016
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications
 
Streaming in the Extreme
Streaming in the ExtremeStreaming in the Extreme
Streaming in the Extreme
 
Where is Data Going? - RMDC Keynote
Where is Data Going? - RMDC KeynoteWhere is Data Going? - RMDC Keynote
Where is Data Going? - RMDC Keynote
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged Applications
 
CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016CEP - simplified streaming architecture - Strata Singapore 2016
CEP - simplified streaming architecture - Strata Singapore 2016
 
Streaming Goes Mainstream: New Architecture & Emerging Technologies for Stream Transport and Processing

  • 1. © 2016 MapR Technologies. Streaming Goes Mainstream: Transport, Processing & Architecture. Ellen Friedman, 12 October 2016, Women in Big Data Meetup #datawomen
  • 2. Contact Information: Ellen Friedman, Solutions Consultant, MapR Technologies. Committer, Apache Drill & Apache Mahout projects. Author, O’Reilly short books. Email: ellenf@apache.org, efriedman@maprtech.com. Twitter: @Ellen_Friedman #datawomen
  • 3. Please support women in tech – help build girls’ dreams of what they can accomplish. © Ellen Friedman 2015
  • 4. The entire industry is undergoing a career change
  • 5. Big Data has caught on
      - Potential value of big data approaches is widely recognized
      - Technologies for distributed storage at low cost are maturing
      - People are looking for operational and analytical solutions in order to take advantage of large-scale data opportunities…
      - Now there’s a new form of revolution based on streaming data
  • 6. Why stream?
  • 7. “Our best understanding comes when our conclusions fit the evidence. And that is most effectively done when our analyses fit the way life happens.” (Introduction to Apache Flink, Friedman & Tzoumas, O’Reilly, Sept 2016)
  • 8. Life doesn’t happen in batches…
  • 9. Time Series Data & the IoT: Sensors in airplanes not only send data to the ERD (black box); they also report back to manufacturers of “smart parts” such as turbines found in jet engines or wind farms. Images © Friedman & Dunning from O’Reilly book A New Look at Anomaly Detection, used with permission.
  • 10. Big data project: Maury’s Wind and Currents charts
      - Value from big data in aggregate
      - Crowd sourced
      - But static: not real-time insights
  • 11. Modern big data navigation: WAZE. “Outsmarting traffic, together” (WAZE website, https://www.waze.com/)
      - Uses real-time streaming traffic & road information shared by 65 million drivers per month
      - Intended to save fuel and time during commute
      - Partnered with Esri GIS software to help put data insights to work for cities and states (11 Oct 2016 article in TechCrunch: http://bit.ly/tech-crunch-waze-esri)
      - Time-value of data is often important
  • 12. Crowd-sourced Traffic. Streaming sensor data + long-term maintenance histories:
      - Machine learning model detects anomalous pattern
      - Signals need for maintenance before damage occurs
      Image courtesy Mtell; from Real World Hadoop by Dunning & Friedman (© 2015), Chap 6
  • 13. Streaming is mainstream
  • 14. Web-based Business
      - A: Real-time insights from low latency applications (update a real-time dashboard)
      - B: Current status updated in databases or search documents (Customer 360)
      - C: Durable messages for auditable history (Security analytics)
      Diagram labels: Messages; Logs; Real-time dashboards; Archived data; Customer 360 database; Security analytics
  • 15. Web-based Business
      - A: Real-time insights from low latency applications (update a real-time dashboard)
      - B: Current status updated in databases or search documents (Customer 360)
      - C: Durable messages for auditable history (Security analytics)
      Diagram labels: Messages; Logs; Real-time dashboards; Archived data; Customer 360 database; Security analytics
  • 16. Streaming data has value beyond real-time insights
  • 17. Web-based Business
      - A: Real-time insights from low latency applications (update a real-time dashboard)
      - B: Current status updated in databases or search documents (Customer 360)
      - C: Durable messages for auditable history (Security analytics)
      Diagram labels: Messages; Logs; Real-time dashboards; Archived data; Customer 360 database; Security analytics
  • 18. At the heart of an effective streaming architecture is the right choice of stream transport.
  • 19. Message Stream Transport: Apache Kafka or MapR Streams; others
  • 20. Key capabilities of message transport technology: Kafka & MapR Streams
      - Highly scalable
      - High throughput, low latency
      - Decouple multiple producers & consumers
      - Durable messages with configurable time to live
      - Geo-distributed replication (MapR Streams)
      Diagram labels: Producers; Messages; Consumer groups
  • 21. Alert: Pre-conceptions can make you miss new ideas
      - It’s hard to order a coffee if you want mostly milk
      - Example: MapR Streams is part of the converged data platform, so it does not require a separate cluster for message transport (as you would need with Kafka)
      - Example: Message streams can support microservices
      “Getting Past Pre-conceptions”: http://bit.ly/mapr-blog-ef-17-08
  • 22. MapR Streams: Topics, Partitions
      - Data is assigned to topics (as in Kafka)
      - A topic can be partitioned for load balancing and performance (as in Kafka)
      - A topic partition is distributed across the MapR cluster (not restricted to one node as in Kafka), which makes long-term auditable history practical
      Diagram labels: Producer 1; Producer 2; Topic 1; Consumer group (Consumer 1, Consumer 2, Consumer 3)
  • 23. Stream-first Architecture: Basis for Microservices. Stream as the shared “truth” instead of a database; database as local truth. Diagram labels: POS 1..n; Fraud detector; Last card use; Updater; Card analytics; Other card activity
  • 24. MapR Streams: Part of MapR Converged Data Platform. MapR Converged Data Platform has distributed files, NoSQL DB & message streams engineered into one technology. Diagram labels: Open Source Engines & Tools; Commercial Engines & Applications; Utility-Grade Platform Services; Data Processing; Enterprise Storage; MapR-FS; MapR-DB; MapR Streams; Database; Event Streaming; Global Namespace; High Availability; Data Protection; Self-healing; Unified Security; Real-time; Multi-tenancy; Search & Others; Cloud & Managed Services; Custom Apps; Unified Management and Monitoring
  • 25. Unique to MapR: Manage topics at Stream level
      - Topics are grouped together in a Stream (different from Kafka)
      - Policies such as time-to-live and ACEs are set at the Stream level (controlled access at this level is different from Kafka)
      - Geo-distributed replication at Stream level (different from Kafka)
      Diagram labels: Stream; Topic 1; Topic 2; Topic 3
  • 26. MapR Streams: Geo-distributed replication of message stream across data centers
  • 27. Multiple Stakeholders: Container Shipping. Over 20% of the world’s shipping containers pass through Singapore’s port. Image © Ellen Friedman 2015
  • 28. MapR Streams replication across data centers
      - A: Sensors stream data to on-board cluster that reports to onshore cluster while in port
      - B: MapR Streams geo-replication sends data to next port before ship arrives
      - C: Real-time insights alert to “high humidity” in some containers
      Diagram labels: Singapore; Tokyo; Sydney; Corporate HQ. Find details on this use case in Chap 7 of the book “Streaming Architecture”; read online here: http://bit.ly/streams-ebook-ch7
  • 29. MapR Streams: Replication Across Data Centers. What’s the value?
      - Replication across data centers with preserved offsets (unlike Kafka)
      - Opens new use cases. Example: shared inventory, as with ad-tech use case
      Diagram labels: Data center 1 (Inventory model, Local state); Data center 2 (Inventory model, Local state); Central data center (Global analytics, Database)
  • 30. What about stream processing?
  • 31. Several good choices for stream processing: you choose the tool you like for processing streaming data
      - MapR ships & supports the full Apache Spark stack including Spark Streaming
      - Apache Flink has been benchmarked on MapR with extremely good performance on MapR Streams transport; Flink is not yet supported by MapR
      - Other good options include Apache Apex (think DataTorrent) & Apache Storm
  • 32. Overview: Apache Flink Stream Processing. Diagram labels: Transport (Kafka / MapR Streams); Processing (Flink); Database; File. Figure 2-1 from “Introduction to Apache Flink” book, used with permission. Download free pdf here: http://bit.ly/mapr-intro-flink-book-pdf
  • 33. Overview: Apache Flink
      - Top level Apache project with big international OSS community
      - True stream processing: an advantage if SLAs require extremely low latency (real-time); good fit to continuous events
      - Also works well for batch processing
      - Being used in production (telecom; games)
  • 34. Flink is BIG in Europe ;-)
  • 35. Stream Processing: Compare Choices
      - “Real-time” event-by-event processing: Apache Flink, Apache Apex, Apache Storm
      - Not “real-time” processing (micro-batching): Apache Spark Streaming
      But latency is just one issue to consider in choosing a stream processing technology…
  • 36. Capabilities for Stream Processing Options. Figure compares Flink, Spark Streaming and Storm on: correct under stress; correct time / window semantics; ease of use / expressiveness; high throughput; low latency. Figure 1-2 from “Introduction to Apache Flink” book, used with permission. Download free pdf here: http://bit.ly/mapr-intro-flink-book-pdf
  • 37. Overview: Apache Flink Windowing
      - Before: windows defined by micro-batches (not Flink)
      - Now: windows defined by gaps between activity (this is Flink)
      Figures 3-1 and 3-2 from “Introduction to Apache Flink” book, used with permission. Download free pdf here: http://bit.ly/mapr-intro-flink-book-pdf
  • 38. Overview: Apache Flink Event Time. Computation can be based on when data is processed (processing time) OR when the event occurred (event time). In many situations, processing by event time provides more accurate results. Figure 3-3 from “Introduction to Apache Flink” book, used with permission.
  • 39. Overview: Apache Flink Event Time. When you analyze data by event time, you must take into account that events may arrive delayed or out of order. This is important for use cases in which you want to correlate events. Stephan Ewen, Apache Flink PMC Committer, explains the event time processing option for Flink in a Whiteboard Walkthrough video: http://bit.ly/mapr-whiteboard-walkthrough-flink-event-time
  • 40. Apache Flink: Useful Characteristics
      - Stateful processing & accuracy under stress: checkpoints
      - Windowing options are a good fit to the way natural sessions occur
      - Event time option for accurate computation. See Whiteboard Walkthrough video by Stephan Ewen (PMC member, Apache Flink) on event time: http://bit.ly/mapr-whiteboard-walkthrough-flink-event-time
      - Savepoints let you reprocess data (bug fixes, updates, etc.). See Whiteboard Walkthrough video by Stephan Ewen on Flink savepoints: http://bit.ly/whiteboard-walkthrough-flink-1
  • 41. Streaming Resources from MapR (thank you). Free resource from MapR: book on Apache Spark. Download free pdf courtesy of MapR Technologies: http://bit.ly/mapr-apache-spark-book-pdf Or read online: http://bit.ly/mapr-apache-spark-ebook
  • 42. Streaming Resources from MapR (thank you). Free resource from MapR: book on stream-first architecture & message transport. Download free pdf courtesy of MapR Technologies: http://bit.ly/mapr-streams-ebook Or read online: http://bit.ly/mapr-streaming-data-ebook
  • 43. Streaming Resources from MapR (thank you). Free resource from MapR: book on Apache Flink stream processing, “Introduction to Apache Flink: Stream Processing for Real Time and Beyond”, a new ebook by Ellen Friedman and Kostas Tzoumas. Download free pdf courtesy of MapR Technologies: http://bit.ly/mapr-intro-flink-book-pdf Or read online: <coming soon>. In this book you’ll learn:
      - What Apache Flink can do
      - How it maintains consistency and provides flexibility
      - How people are using it, including in production
      - Best practices for streaming architectures
      Download your copy: mapr.com/flink-book
  • 44. Short Books by Ted Dunning & Ellen Friedman. For sale from Amazon or O’Reilly. Free pdf download courtesy of MapR: www.mapr.com/ebook http://bit.ly/ebook-real-world-hadoop http://bit.ly/mapr-tsdb-ebook http://bit.ly/ebook-anomaly http://bit.ly/recommendation-ebook http://bit.ly/mapr-ebook-sharing-data
  • 45. Please support women in tech – help build girls’ dreams of what they can accomplish. © Ellen Friedman 2015
  • 46. Thank you!
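
The topic, partition and consumer-group ideas on slides 20 and 22 can be illustrated with a small sketch. It is plain Python and does not use the Kafka or MapR Streams client APIs; the function names `partition_for` and `assign_partitions` are hypothetical, and CRC32 stands in for whatever hash a real client uses. It only shows the general idea: messages with the same key land in the same partition, and the partitions of a topic are shared out among the consumers of one group.

```python
import zlib
from typing import Dict, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition (illustrative hash, not a real client's)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Share a topic's partitions round-robin among the consumers of one group."""
    assignment: Dict[str, List[int]] = {name: [] for name in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # The same key always maps to the same partition, so per-key order is kept.
    for key in ["card-1", "card-2", "card-1"]:
        print(key, "-> partition", partition_for(key, 4))
    # Each partition is read by exactly one consumer in the group.
    print(assign_partitions(4, ["consumer-1", "consumer-2", "consumer-3"]))
```

Because each partition has a single owner within a group, adding consumers (up to the number of partitions) spreads the load without two consumers of the same group processing the same message.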
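
Slides 37 to 39 describe windows defined by gaps between activity and computation based on event time, where events may arrive out of order. The sketch below is a minimal illustration of that idea in plain Python, not Flink code: the `Event` type, the `session_windows` function and the 10-second gap are assumptions made for the example, and it sorts a finished list of events rather than handling an unbounded stream with checkpoints as a real stream processor would.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    user: str
    event_time: int  # seconds: when the event occurred, not when it arrived


def session_windows(events: List[Event], gap: int) -> Dict[str, List[List[int]]]:
    """Group each user's events into sessions split by more than `gap` seconds of inactivity."""
    sessions: Dict[str, List[List[int]]] = {}
    # Ordering by event time makes the result independent of arrival order.
    for ev in sorted(events, key=lambda e: (e.user, e.event_time)):
        user_sessions = sessions.setdefault(ev.user, [])
        if user_sessions and ev.event_time - user_sessions[-1][-1] <= gap:
            user_sessions[-1].append(ev.event_time)  # still within the current session
        else:
            user_sessions.append([ev.event_time])  # gap exceeded: start a new session
    return sessions


# Events listed in arrival order; note that they are out of order by event time.
arrivals = [
    Event("u1", 5), Event("u1", 30), Event("u1", 0),
    Event("u2", 12), Event("u1", 8), Event("u1", 33),
]
result = session_windows(arrivals, gap=10)

if __name__ == "__main__":
    print(result)
```

Grouping by arrival order instead would split or merge sessions depending on network delays, which is the accuracy problem that event-time processing is meant to avoid.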