1©MapR Technologies - Confidential
Expect More from Hadoop
2©MapR Technologies - Confidential
Introducing MapR
MapR offers the technology-leading distribution for Hadoop
3©MapR Technologies - Confidential
Industry Leaders Choose MapR in the Cloud
Google chose MapR to provide Hadoop on Google Compute Engine
Amazon EMR is the largest Hadoop provider by revenue and number of clusters
4©MapR Technologies - Confidential
MapR Supports a Broad Set of Use Cases
• Log analysis
• HBase
• Customer targeting
• Social media analysis
• Customer Revenue Analytics
• ETL Offload
• Advertising exchange analysis and optimization
• Clickstream Analysis
• Quality profiling/field failure analysis
• Customer Sentiment
• Network Analytics
• Monitors and measures behavior of online shoppers
• Fraud Detection
• Channel analytics
• Customer Behavior Analysis
• Brand Monitoring
• Customer targeting
• Viewer Behavioral analytics
• Recommendation Engine
• Family tree connections
• Intrusion detection & prevention
• Forensic analysis
• Global threat analytics
• Virus analysis
• Patient care monitoring
Leading Retailer
• Recommendation Engine
• Fraud detection and Prevention
Leading Bank
5©MapR Technologies - Confidential
Introducing Hadoop
Hadoop is deployed because of
a) big data
b) fast data
c) rapidly changing data
8©MapR Technologies - Confidential
Introducing Change
Changing data implies a need for integration
If you copy, the data will change before you finish.
10©MapR Technologies - Confidential
Controlling Change
Changing data implies a need for stabilization
Long-running analyses must have stable data
11©MapR Technologies - Confidential
The Story Can Now Be Told
Here are three true stories about how Hadoop integration pays off
12©MapR Technologies - Confidential
Story #1
ETL Off-load
13©MapR Technologies - Confidential
The Problem
• Major telecom vendor
• Key step in billing pipeline handled by data warehouse (EDW)
• EDW at maximum capacity
• Multiple rounds of software optimization already done
• Revenue-limiting (= career-limiting) bottleneck
15©MapR Technologies - Confidential
Original Flow
[Diagram: CDR billing records, ETL, Data Warehouse, billing reports, customer bills. Annotations: 70% of total load, <10% of total code; import by bulk load from NFS.]
16©MapR Technologies - Confidential
With ETL Offload
[Diagram: CDR billing records, ETL, Data Warehouse, billing reports, customer billing. Annotations: import written to MapR via NFS; bulk load via NFS from MapR.]
17©MapR Technologies - Confidential
Simplified Analysis – EDW Strategy
• 70% of EDW consumed by ETL processing
• EDW direct hardware cost is approximately $30 million CAPEX, $12 million OPEX
• Additional EDW only increases capacity by 50% due to poor division of labor
18©MapR Technologies - Confidential
Simplified Analysis – MapR Strategy
• Hardware + MapR cost ~ $1.5 million
• ETL replacement development costs ~ $1.5 million
• Result is 3x performance increase
19©MapR Technologies - Confidential
Price Performance
• EDW strategy
– 1.5x performance
– $30 million
• MapR strategy
– 3x performance
– $3 million
• 20x cost/performance advantage for MapR strategy
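The 20x figure follows from the numbers on the two preceding slides. A minimal sketch of the arithmetic, using only the quoted CAPEX and development costs (the $12 million OPEX is left out, as it is on the slide); the variable and function names are illustrative:

```python
# Figures quoted on the slides; names are illustrative, not from the deck.
EDW_COST_MUSD = 30.0          # additional EDW hardware, CAPEX only
EDW_SPEEDUP = 1.5             # capacity gain from adding a second EDW
MAPR_COST_MUSD = 1.5 + 1.5    # hardware + MapR, plus ETL redevelopment
MAPR_SPEEDUP = 3.0


def cost_per_unit_speedup(cost_musd: float, speedup: float) -> float:
    """Millions of dollars paid per 1x of performance."""
    return cost_musd / speedup


edw = cost_per_unit_speedup(EDW_COST_MUSD, EDW_SPEEDUP)
mapr = cost_per_unit_speedup(MAPR_COST_MUSD, MAPR_SPEEDUP)
advantage = edw / mapr

print(f"EDW:  ${edw:.1f}M per 1x")
print(f"MapR: ${mapr:.1f}M per 1x")
print(f"Advantage: {advantage:.0f}x")
```

The EDW route costs $20 million per 1x of performance and the MapR route $1 million, which gives the 20x ratio.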
20©MapR Technologies - Confidential
Story #2
Search Abuse
21©MapR Technologies - Confidential
The Problem
• Build a high-performance recommendation engine
– Use all kinds of available data
• Deploy it to production
– Must have efficient deployment
23©MapR Technologies - Confidential
Input Data
• User transactions
– user id, merchant id
– SIC code, amount
• Offer transactions
– user id, offer id
– vendor id, merchant ids
– offers, views, accepts
Import data via standard interfaces from log files, databases, direct feeds
Find anomalous indicators of behavior
26©MapR Technologies - Confidential
Search-based Recommendations
• Sample “document”
– Merchant id
– Field for text description
– Phone
– Address
– Location
– Indicator merchant ids
– Indicator industry (SIC) ids
– Indicator offers
– Indicator text
– Local top 40
• User History (query)
– Current location
– Recent merchant descriptions
– Recent merchant ids
– Recent SIC codes
– Recent accepted offers
– Local top 40
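The document/query pairing above can be illustrated with a toy scorer. This is only a sketch under stated assumptions: the field names, merchant ids, and the overlap-count scoring are all hypothetical, and a real search engine would weight terms by relevance rather than simply counting matches.

```python
# Hypothetical merchant "documents"; every field name and value is illustrative.
merchants = [
    {"merchant_id": "m1",
     "indicator_merchant_ids": {"m7", "m9"},
     "indicator_sic_ids": {"5812"},
     "indicator_offers": {"o3"}},
    {"merchant_id": "m2",
     "indicator_merchant_ids": {"m4"},
     "indicator_sic_ids": {"5251"},
     "indicator_offers": set()},
]

# Hypothetical user history, playing the role of the query.
history = {
    "recent_merchant_ids": {"m7", "m4"},
    "recent_sic_codes": {"5812"},
    "recent_accepted_offers": {"o3"},
}

# Which history field is matched against which indicator field.
FIELD_PAIRS = [
    ("recent_merchant_ids", "indicator_merchant_ids"),
    ("recent_sic_codes", "indicator_sic_ids"),
    ("recent_accepted_offers", "indicator_offers"),
]


def score(doc, user_history):
    """Count history terms that appear in the matching indicator field."""
    return sum(len(user_history[h] & doc[d]) for h, d in FIELD_PAIRS)


def recommend(docs, user_history, k=1):
    """Return the ids of the k best-matching merchant documents."""
    ranked = sorted(docs, key=lambda d: score(d, user_history), reverse=True)
    return [d["merchant_id"] for d in ranked[:k]]


print(recommend(merchants, history))
```

The point of the design is that recommendation becomes an ordinary search: the user's recent behavior is the query, and the indicator fields are just more indexed text on each merchant document.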
28©MapR Technologies - Confidential
[Diagram: Transactions, Web Views, Email offers, Cooccurrence (Mahout), Item metadata, Solr indexing, Solr indexers, Index shards. Annotation: legacy code runs directly in the map-reduce framework.]
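The cooccurrence step in the diagram produces the indicator fields that get indexed. A minimal in-memory sketch of the idea, assuming hypothetical user histories and a raw count threshold; it is not Mahout's implementation, and production systems usually filter pairs with a statistical test rather than a fixed cutoff:

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical user histories: user id -> set of merchant ids visited.
histories = {
    "u1": {"m1", "m2", "m3"},
    "u2": {"m1", "m2"},
    "u3": {"m2", "m3"},
    "u4": {"m1", "m2"},
}


def cooccurrence_counts(user_histories):
    """Count how many users have each unordered pair of items in common."""
    counts = Counter()
    for items in user_histories.values():
        for a, b in combinations(sorted(items), 2):
            counts[(a, b)] += 1
    return counts


def indicators(counts, min_count=2):
    """Map each item to the items it co-occurs with at least min_count times."""
    result = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_count:
            result[a].add(b)
            result[b].add(a)
    return dict(result)


counts = cooccurrence_counts(histories)
print(indicators(counts))
```

Each item's indicator set would then be written into the corresponding document as an extra searchable field.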
30©MapR Technologies - Confidential
[Diagram: User history, Web tier, Solr search, Solr indexers, Item metadata, Index shards. Annotation: SolrCloud runs without change via NFS.]
31©MapR Technologies - Confidential
Objective Results
• At a very large credit card company
• History is all transactions, all web interaction
• Processing time cut from 20 hours per day to 3 hours
• Recommendation engine load time decreased from 8 hours to 3 minutes
32©MapR Technologies - Confidential
Story #3
Stable Learning
34©MapR Technologies - Confidential
The Theme and Setting
• A humble machine learning expert once lived in a small cubicle
• One day the CEO walked in and said
– Your machine recommended PINK WAFFLES to my wife!!!
– Tell me why it is suddenly doing this
• The machine learning expert could say nothing because he could not reproduce the conditions the model was trained under
• The CEO was not pleased
35©MapR Technologies - Confidential
Why?
37©MapR Technologies - Confidential
[Diagram: Twitter, Website, Data Logger, Web Service, Kafka API, Kafka Clusters, Storm, Flume, NAS, Web Data, HDFS Data, Hadoop. Annotations: data arrives continuously; it can be delayed arbitrarily; learning steps can’t be tied to delayed data.]
38©MapR Technologies - Confidential
The Essence of the Problem
• Coupling data arrival with modeling makes the data chain brittle
– Minor delays in data delivery will break modeling SLAs
• But if data can arrive late and restate the past, then we can’t easily replicate a model build
• Existing data chains don’t support full bitemporal queries
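A bitemporal query asks two questions at once: what happened up to some event time, and what was known about it at some later point. The sketch below shows the idea on a hypothetical in-memory log; the record layout, field names, and integer timestamps are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    value: float
    event_time: int    # when the event happened
    arrival_time: int  # when the system learned about it


# Hypothetical log; the second row arrives late and restates the past.
log = [
    Record("user42", 10.0, event_time=1, arrival_time=1),
    Record("user42", 12.0, event_time=1, arrival_time=5),
    Record("user42", 7.0, event_time=3, arrival_time=3),
]


def as_of(records, event_cutoff, knowledge_cutoff):
    """Events up to event_cutoff, as they were known at knowledge_cutoff."""
    latest = {}
    for r in sorted(records, key=lambda r: r.arrival_time):
        if r.event_time <= event_cutoff and r.arrival_time <= knowledge_cutoff:
            latest[(r.key, r.event_time)] = r  # later arrivals overwrite
    return sorted(latest.values(), key=lambda r: r.event_time)


# What a model trained at time 4 saw, versus what is known at time 6.
seen_at_training = [r.value for r in as_of(log, 3, 4)]
known_now = [r.value for r in as_of(log, 3, 6)]
print(seen_at_training)
print(known_now)
```

Replaying a model build requires the first view, not the second: the late restatement changes the answer for the same event-time window.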
39©MapR Technologies - Confidential
[Diagram: Twitter, Website, Data Logger, MapR, Data, Snap, Modeling, Model (x4), Mirror, Live System.]
41©MapR Technologies - Confidential
The New Story
• A humble machine learning expert once lived in a small cubicle
• One day the CEO walked in and said
– Your machine recommended PINK WAFFLES to my wife!!!
– Tell me why it is suddenly doing this
• The machine learning expert could
– Pull out all previously deployed models
– Exactly replicate any training run with any version of software
– Point out that PINK WAFFLES were actually quite stylish
• The CEO was very pleased … he ran off to buy pink waffles
42©MapR Technologies - Confidential
Expect More from Hadoop
43©MapR Technologies - Confidential
Expect MapR
44©MapR Technologies - Confidential
Contact me!
• tdunning@maprtech.com or tdunning@apache.org
• @ted_dunning
• Come to the MapR booth

Más contenido relacionado

La actualidad más candente

Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBStructured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBCarol McDonald
 
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...Amazon Web Services
 
Stream Processing in 2019 - AWS Summit Sydney
Stream Processing in 2019 - AWS Summit Sydney Stream Processing in 2019 - AWS Summit Sydney
Stream Processing in 2019 - AWS Summit Sydney Amazon Web Services
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingKai Wähner
 
Openshift serverless Solution
Openshift serverless SolutionOpenshift serverless Solution
Openshift serverless SolutionRyan ZhangCheng
 
The Rise Of Event Streaming – Why Apache Kafka Changes Everything
The Rise Of Event Streaming – Why Apache Kafka Changes EverythingThe Rise Of Event Streaming – Why Apache Kafka Changes Everything
The Rise Of Event Streaming – Why Apache Kafka Changes EverythingKai Wähner
 
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...Carol McDonald
 
Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...
Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...
Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...Kai Wähner
 
Transform and Bridge the Digital Disconnect with SAP Solutions
Transform and Bridge the Digital Disconnect with SAP SolutionsTransform and Bridge the Digital Disconnect with SAP Solutions
Transform and Bridge the Digital Disconnect with SAP SolutionsCapgemini
 
Beyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIBeyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIDataWorks Summit
 
Learnings from our Journey to Become an Event-Driven Customer Data Platform (...
Learnings from our Journey to Become an Event-Driven Customer Data Platform (...Learnings from our Journey to Become an Event-Driven Customer Data Platform (...
Learnings from our Journey to Become an Event-Driven Customer Data Platform (...confluent
 
Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0MapR Technologies
 
Demystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep LearningDemystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep LearningCarol McDonald
 
When Streaming Becomes Strategic
When Streaming Becomes StrategicWhen Streaming Becomes Strategic
When Streaming Becomes StrategicMapR Technologies
 
Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures Carol McDonald
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Carol McDonald
 
Extending the Reach of R to the Enterprise with TERR and Spotfire
Extending the Reach of R to the Enterprise with TERR and SpotfireExtending the Reach of R to the Enterprise with TERR and Spotfire
Extending the Reach of R to the Enterprise with TERR and SpotfireLou Bajuk
 
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...Carol McDonald
 
Next Generation Audience Measurement at Spectrum Reach
Next Generation Audience Measurement at Spectrum ReachNext Generation Audience Measurement at Spectrum Reach
Next Generation Audience Measurement at Spectrum ReachTim Case
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Revolution Analytics
 

La actualidad más candente (20)

Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DBStructured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
Structured Streaming Data Pipeline Using Kafka, Spark, and MapR-DB
 
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
Amazon Redshift Update and How Equinox Fitness Clubs Migrated to a Modern Dat...
 
Stream Processing in 2019 - AWS Summit Sydney
Stream Processing in 2019 - AWS Summit Sydney Stream Processing in 2019 - AWS Summit Sydney
Stream Processing in 2019 - AWS Summit Sydney
 
Apache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and ManufacturingApache Kafka Landscape for Automotive and Manufacturing
Apache Kafka Landscape for Automotive and Manufacturing
 
Openshift serverless Solution
Openshift serverless SolutionOpenshift serverless Solution
Openshift serverless Solution
 
The Rise Of Event Streaming – Why Apache Kafka Changes Everything
The Rise Of Event Streaming – Why Apache Kafka Changes EverythingThe Rise Of Event Streaming – Why Apache Kafka Changes Everything
The Rise Of Event Streaming – Why Apache Kafka Changes Everything
 
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
Applying Machine Learning to IOT: End to End Distributed Pipeline for Real- T...
 
Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...
Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...
Simplified Machine Learning Architecture with an Event Streaming Platform (Ap...
 
Transform and Bridge the Digital Disconnect with SAP Solutions
Transform and Bridge the Digital Disconnect with SAP SolutionsTransform and Bridge the Digital Disconnect with SAP Solutions
Transform and Bridge the Digital Disconnect with SAP Solutions
 
Beyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AIBeyond Big Data: Data Science and AI
Beyond Big Data: Data Science and AI
 
Learnings from our Journey to Become an Event-Driven Customer Data Platform (...
Learnings from our Journey to Become an Event-Driven Customer Data Platform (...Learnings from our Journey to Become an Event-Driven Customer Data Platform (...
Learnings from our Journey to Become an Event-Driven Customer Data Platform (...
 
Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0Open Source Innovations in the MapR Ecosystem Pack 2.0
Open Source Innovations in the MapR Ecosystem Pack 2.0
 
Demystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep LearningDemystifying AI, Machine Learning and Deep Learning
Demystifying AI, Machine Learning and Deep Learning
 
When Streaming Becomes Strategic
When Streaming Becomes StrategicWhen Streaming Becomes Strategic
When Streaming Becomes Strategic
 
Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures Streaming patterns revolutionary architectures
Streaming patterns revolutionary architectures
 
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
Applying Machine learning to IOT: End to End Distributed Distributed Pipeline...
 
Extending the Reach of R to the Enterprise with TERR and Spotfire
Extending the Reach of R to the Enterprise with TERR and SpotfireExtending the Reach of R to the Enterprise with TERR and Spotfire
Extending the Reach of R to the Enterprise with TERR and Spotfire
 
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...
Streaming Machine learning Distributed Pipeline for Real-Time Uber Data Using...
 
Next Generation Audience Measurement at Spectrum Reach
Next Generation Audience Measurement at Spectrum ReachNext Generation Audience Measurement at Spectrum Reach
Next Generation Audience Measurement at Spectrum Reach
 
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...Performance and Scale Options for R with Hadoop: A comparison of potential ar...
Performance and Scale Options for R with Hadoop: A comparison of potential ar...
 

Destacado

Spark Application for Time Series Analysis
Spark Application for Time Series AnalysisSpark Application for Time Series Analysis
Spark Application for Time Series AnalysisMapR Technologies
 
Recommendation Techn
Recommendation TechnRecommendation Techn
Recommendation TechnTed Dunning
 
Practical Machine Learning: Innovations in Recommendation Workshop
Practical Machine Learning:  Innovations in Recommendation WorkshopPractical Machine Learning:  Innovations in Recommendation Workshop
Practical Machine Learning: Innovations in Recommendation WorkshopMapR Technologies
 
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeReal-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeTed Dunning
 
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)Spark Summit
 
Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About TimeReal Time and Big Data – It’s About Time
Real Time and Big Data – It’s About TimeMapR Technologies
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged ApplicationsMapR Technologies
 
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...Spark Summit
 
Intro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco VasquezIntro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco VasquezMapR Technologies
 
Drilling into Data with Apache Drill
Drilling into Data with Apache DrillDrilling into Data with Apache Drill
Drilling into Data with Apache DrillMapR Technologies
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksMapR Technologies
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications MapR Technologies
 
Spark DataFrames and ML Pipelines
Spark DataFrames and ML PipelinesSpark DataFrames and ML Pipelines
Spark DataFrames and ML PipelinesDatabricks
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR Technologies
 
MLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning LibraryMLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning Libraryjeykottalam
 
Combining Machine Learning Frameworks with Apache Spark
Combining Machine Learning Frameworks with Apache SparkCombining Machine Learning Frameworks with Apache Spark
Combining Machine Learning Frameworks with Apache SparkDatabricks
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment Databricks
 
Practical Machine Learning Pipelines with MLlib
Practical Machine Learning Pipelines with MLlibPractical Machine Learning Pipelines with MLlib
Practical Machine Learning Pipelines with MLlibDatabricks
 
Analyzing Log Data With Apache Spark
Analyzing Log Data With Apache SparkAnalyzing Log Data With Apache Spark
Analyzing Log Data With Apache SparkSpark Summit
 
Apache Spark MLlib 2.0 Preview: Data Science and Production
Apache Spark MLlib 2.0 Preview: Data Science and ProductionApache Spark MLlib 2.0 Preview: Data Science and Production
Apache Spark MLlib 2.0 Preview: Data Science and ProductionDatabricks
 

Destacado (20)

Spark Application for Time Series Analysis
Spark Application for Time Series AnalysisSpark Application for Time Series Analysis
Spark Application for Time Series Analysis
 
Recommendation Techn
Recommendation TechnRecommendation Techn
Recommendation Techn
 
Practical Machine Learning: Innovations in Recommendation Workshop
Practical Machine Learning:  Innovations in Recommendation WorkshopPractical Machine Learning:  Innovations in Recommendation Workshop
Practical Machine Learning: Innovations in Recommendation Workshop
 
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-timeReal-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
 
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
 
Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About TimeReal Time and Big Data – It’s About Time
Real Time and Big Data – It’s About Time
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged Applications
 
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
Building Large Scale Machine Learning Applications with Pipelines-(Evan Spark...
 
Intro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco VasquezIntro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco Vasquez
 
Drilling into Data with Apache Drill
Drilling into Data with Apache DrillDrilling into Data with Apache Drill
Drilling into Data with Apache Drill
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
 
How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications How Spark is Enabling the New Wave of Converged Cloud Applications
How Spark is Enabling the New Wave of Converged Cloud Applications
 
Spark DataFrames and ML Pipelines
Spark DataFrames and ML PipelinesSpark DataFrames and ML Pipelines
Spark DataFrames and ML Pipelines
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data Platform
 
MLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning LibraryMLlib: Spark's Machine Learning Library
MLlib: Spark's Machine Learning Library
 
Combining Machine Learning Frameworks with Apache Spark
Combining Machine Learning Frameworks with Apache SparkCombining Machine Learning Frameworks with Apache Spark
Combining Machine Learning Frameworks with Apache Spark
 
Apache Spark Model Deployment
Apache Spark Model Deployment Apache Spark Model Deployment
Apache Spark Model Deployment
 
Practical Machine Learning Pipelines with MLlib
Practical Machine Learning Pipelines with MLlibPractical Machine Learning Pipelines with MLlib
Practical Machine Learning Pipelines with MLlib
 
Analyzing Log Data With Apache Spark
Analyzing Log Data With Apache SparkAnalyzing Log Data With Apache Spark
Analyzing Log Data With Apache Spark
 
Apache Spark MLlib 2.0 Preview: Data Science and Production
Apache Spark MLlib 2.0 Preview: Data Science and ProductionApache Spark MLlib 2.0 Preview: Data Science and Production
Apache Spark MLlib 2.0 Preview: Data Science and Production
 

Similar a Big Data Paris

Powering the "As it Happens" Business
Powering the "As it Happens" BusinessPowering the "As it Happens" Business
Powering the "As it Happens" BusinessMapR Technologies
 
Achieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate DataAchieving Business Value by Fusing Hadoop and Corporate Data
Achieving Business Value by Fusing Hadoop and Corporate DataInside Analysis
 
Key Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareKey Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareMapR Technologies
 
Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise WeAreEsynergy
 
The power of hadoop in business
The power of hadoop in businessThe power of hadoop in business
The power of hadoop in businessMapR Technologies
 
Predictive Analytics San Diego
Predictive Analytics San DiegoPredictive Analytics San Diego
Predictive Analytics San DiegoMapR Technologies
 
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTBig Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTKiththi Perera
 
Big data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardBig data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardKiththi Perera
 
How Experian increased insights with Hadoop
How Experian increased insights with HadoopHow Experian increased insights with Hadoop
How Experian increased insights with HadoopPrecisely
 
Network and IT Ops Series: Build Production Solutions
Network and IT Ops Series: Build Production Solutions Network and IT Ops Series: Build Production Solutions
Network and IT Ops Series: Build Production Solutions Neo4j
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersDataWorks Summit
 
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdf
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdfth1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdf
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdfTarekHassan840678
 
Monetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service ProvidersMonetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service ProvidersDataWorks Summit
 
Sap Leonardo - what is it, and why would I want one?
Sap Leonardo - what is it, and why would I want one?Sap Leonardo - what is it, and why would I want one?
Sap Leonardo - what is it, and why would I want one?Tom Raftery
 
Getting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesGetting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesSingleStore
 
Multi-Cloud Breaks IT Ops: Best Practices to De-Risk Your Cloud Strategy
Multi-Cloud Breaks IT Ops: Best Practices to De-Risk Your Cloud StrategyMulti-Cloud Breaks IT Ops: Best Practices to De-Risk Your Cloud Strategy
Multi-Cloud Breaks IT Ops: Best Practices to De-Risk Your Cloud StrategyThousandEyes
 
Serhii Kholodniuk: What you need to know, before migrating data platform to G...
Serhii Kholodniuk: What you need to know, before migrating data platform to G...Serhii Kholodniuk: What you need to know, before migrating data platform to G...
Serhii Kholodniuk: What you need to know, before migrating data platform to G...Lviv Startup Club
 
Polyvalent recommendations
Polyvalent recommendationsPolyvalent recommendations
Polyvalent recommendationsTed Dunning
 

Big Data Paris

  • 1. Expect More from Hadoop (©MapR Technologies - Confidential)
  • 2. Introducing MapR: MapR offers the technology-leading distribution for Hadoop
  • 3. The Industry Leaders Choose MapR in the Cloud
      • Google chose MapR to provide Hadoop on Google Compute Engine
      • Amazon EMR is the largest Hadoop provider in revenue and number of clusters
  • 4. MapR Supports Broad Set of Use Cases
      • Log analysis
      • HBase
      • Customer targeting
      • Social media analysis
      • Customer Revenue Analytics
      • ETL Offload
      • Advertising exchange analysis and optimization
      • Clickstream Analysis
      • Quality profiling/field failure analysis
      • Customer Sentiment
      • Network Analytics
      • Monitors and measures behavior of online shoppers
      • Fraud Detection
      • Channel analytics
      • Customer Behavior Analysis
      • Brand Monitoring
      • Customer targeting
      • Viewer Behavioral analytics
      • Recommendation Engine
      • Family tree connections
      • Intrusion detection & prevention
      • Forensic analysis
      • Global threat analytics
      • Virus analysis
      • Patient care monitoring
      Leading Retailer
      • Recommendation Engine
      • Fraud detection and Prevention
      Leading Bank
  • 5. Introducing Hadoop
      Hadoop is deployed because of
      a) big data
      b) fast data
      c) rapidly changing data
  • 6. Introducing Hadoop
      Hadoop is deployed because of
      a) big data
      b) fast data
      c) rapidly changing data
  • 7. Introducing Change: Changing data implies a need for integration
  • 8. Introducing Change: Changing data implies a need for integration. If you copy, the data will change before you finish.
  • 9. Controlling Change: Changing data implies a need for stabilization
  • 10. Controlling Change: Changing data implies a need for stabilization. Long-running analyses must have stable data.
  • 11. The Story Can Now be Told: Here are three true stories about how Hadoop integration pays off
  • 12. Story #1: ETL Off-load
  • 13. The Problem
      • Major telecom vendor
      • Key step in billing pipeline handled by data warehouse (EDW)
      • EDW at maximum capacity
      • Multiple rounds of software optimization already done
      • Revenue limiting (= career limiting) bottleneck
  • 14. Original Flow (diagram). Labels: ETL; CDR billing records; Billing reports; Data Warehouse; Customer bills
  • 15. Original Flow (diagram). Labels: ETL; CDR billing records; Billing reports; Data Warehouse; Customer bills
      • 70% of total load
      • <10% of total code
      • Import by bulk load from NFS
  • 16. With ETL Offload (diagram). Labels: ETL; CDR billing records; Billing reports; Data Warehouse; Customer billing
      • Import written to MapR via NFS
      • Bulk load via NFS from MapR
  • 17. Simplified Analysis – EDW Strategy
      • 70% of EDW consumed by ETL processing
      • EDW direct hardware cost is approximately $30 million CAPEX, $12 million OPEX
      • Additional EDW only increases capacity by 50% due to poor division of labor
  • 18. Simplified Analysis – MapR Strategy
      • Hardware + MapR cost ~ $1.5 million
      • ETL replacement development costs ~ $1.5 million
      • Result is 3x performance increase
  • 19. Price Performance
      • EDW strategy
          – 1.5x performance
          – $30 million
      • MapR strategy
          – 3x performance
          – $3 million
      • 20x cost/performance advantage for MapR strategy
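The 20x figure on the Price Performance slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the CAPEX and performance figures quoted on the slides, leaves out the $12 million OPEX just as the slide's own comparison does, and the names `strategies` and `performance_per_million` are made up for this example.

```python
# Figures quoted on the slides; performance is relative to the current EDW (1.0x).
# The MapR cost is hardware + MapR (~$1.5M) plus ETL redevelopment (~$1.5M).
strategies = {
    "EDW": {"performance": 1.5, "cost_musd": 30.0},
    "MapR": {"performance": 3.0, "cost_musd": 1.5 + 1.5},
}


def performance_per_million(performance: float, cost_musd: float) -> float:
    """Performance multiple obtained per million dollars spent."""
    return performance / cost_musd


ratios = {
    name: performance_per_million(s["performance"], s["cost_musd"])
    for name, s in strategies.items()
}

# Ratio of the two price/performance figures.
advantage = ratios["MapR"] / ratios["EDW"]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x performance per $1M")
print(f"MapR advantage: {advantage:.0f}x")
```

The EDW strategy buys 0.05x of performance per million dollars and the MapR strategy buys 1.0x, which is where the 20x ratio comes from.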
  • 20. Story #2: Search Abuse
  • 21. The Problem
      • Build a high-performance recommendation engine
          – Use all kinds of available data
      • Deploy it to production
          – Must have efficient deployment
  • 22. Input Data
      • User transactions
          – user id, merchant id
          – SIC code, amount
      • Offer transactions
          – user id, offer id
          – vendor id, merchant id’s
          – offers, views, accepts
  • 23. Input Data
      • User transactions
          – user id, merchant id
          – SIC code, amount
      • Offer transactions
          – user id, offer id
          – vendor id, merchant id’s
          – offers, views, accepts
      • Import data via standard interfaces from log files, databases, direct feeds
      • Find anomalous indicators of behavior
  • 24. Search-based Recommendations
      • Sample document
          – Merchant id
          – Field for text description
          – Phone
          – Address
          – Location
  • 25. Search-based Recommendations
      • Sample “document”
          – Merchant id
          – Field for text description
          – Phone
          – Address
          – Location
          – Indicator merchant id’s
          – Indicator industry (SIC) id’s
          – Indicator offers
          – Indicator text
          – Local top40
  • 26. Search-based Recommendations
      • Sample “document”
          – Merchant id
          – Field for text description
          – Phone
          – Address
          – Location
          – Indicator merchant id’s
          – Indicator industry (SIC) id’s
          – Indicator offers
          – Indicator text
          – Local top40
      • User history (query)
          – Current location
          – Recent merchant descriptions
          – Recent merchant id’s
          – Recent SIC codes
          – Recent accepted offers
          – Local top40
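The idea on the Search-based Recommendations slides is that each merchant "document" carries indicator fields, and a user's recent history is issued as a query against those fields. A minimal sketch of that matching step follows, assuming a plain term-overlap count stands in for the search engine's relevance score. The merchant ids, SIC codes, offer ids, field names and the functions `score` and `recommend` are all invented for illustration; they are not taken from the deck or from Solr.

```python
from typing import Dict, List, Set

Document = Dict[str, Set[str]]

# Hypothetical merchant "documents" holding only their indicator fields.
items: Dict[str, Document] = {
    "m1": {"indicator_merchants": {"m7", "m9"}, "indicator_sic": {"5812"}, "indicator_offers": {"o3"}},
    "m2": {"indicator_merchants": {"m4"}, "indicator_sic": {"5411", "5812"}, "indicator_offers": set()},
    "m3": {"indicator_merchants": {"m8"}, "indicator_sic": {"7011"}, "indicator_offers": {"o5"}},
}

# Hypothetical user history, arranged as a query over the same fields.
user_history: Document = {
    "indicator_merchants": {"m7", "m4"},  # recent merchant ids
    "indicator_sic": {"5812"},            # recent SIC codes
    "indicator_offers": {"o3"},           # recently accepted offers
}


def score(doc: Document, query: Document) -> int:
    """Count query terms that appear in the matching indicator field."""
    return sum(len(doc.get(field, set()) & terms) for field, terms in query.items())


def recommend(docs: Dict[str, Document], query: Document, k: int = 2) -> List[str]:
    """Return up to k item ids with a non-zero score, best first (ties by id)."""
    ranked = sorted(docs, key=lambda item_id: (-score(docs[item_id], query), item_id))
    return [item_id for item_id in ranked if score(docs[item_id], query) > 0][:k]


print(recommend(items, user_history))
```

A real search engine would weight rare terms more heavily than a raw overlap count does, so this only shows the shape of the query, not the ranking function.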
  • 27. Diagram labels: Solr indexer (×2); Solr indexing; Cooccurrence (Mahout); Item metadata; Index shards; Transactions; Web views; Email offers
  • 28. Diagram labels: Solr indexer (×2); Solr indexing; Cooccurrence (Mahout); Item metadata; Index shards; Transactions; Web views; Email offers
      • Legacy code runs directly in map-reduce framework
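The "Cooccurrence (Mahout)" box in the diagram is the step that produces the indicator fields to be indexed. As a rough sketch of the idea only, the code below counts how often two merchants appear in the same user's history and keeps pairs seen at least `min_count` times. Raw counts with a threshold are a simplification chosen for this example and are not claimed to be what Mahout computes; the histories and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Hypothetical per-user merchant histories.
histories: Dict[str, List[str]] = {
    "u1": ["m1", "m2", "m3"],
    "u2": ["m1", "m2"],
    "u3": ["m2", "m3"],
    "u4": ["m1", "m2"],
}


def cooccurrence_counts(user_items: Dict[str, List[str]]) -> Dict[Tuple[str, str], int]:
    """Count, for each unordered item pair, the users whose history has both."""
    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for item_list in user_items.values():
        for a, b in combinations(sorted(set(item_list)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def indicators(user_items: Dict[str, List[str]], min_count: int = 2) -> Dict[str, Set[str]]:
    """Map each item to the items that co-occur with it at least min_count times."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for (a, b), n in cooccurrence_counts(user_items).items():
        if n >= min_count:
            result[a].add(b)
            result[b].add(a)
    return dict(result)


print(indicators(histories))
```

Each resulting set would be stored as the "indicator merchant id's" field of the corresponding merchant document before indexing.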
  • 29. Diagram labels: Solr indexer (×2); Solr search; Web tier; Item metadata; Index shards; User history
  • 30. Diagram labels: Solr indexer (×2); Solr search; Web tier; Item metadata; Index shards; User history
      • SolrCloud runs without change via NFS
  • 31. Objective Results
      • At a very large credit card company
      • History is all transactions, all web interaction
      • Processing time cut from 20 hours per day to 3
      • Recommendation engine load time decreased from 8 hours to 3 minutes
  • 32. Story #3: Stable Learning
  • 33. The Theme and Setting
      • A humble machine learning expert once lived in a small cubicle
      • One day the CEO walked in and said
          – Your machine recommended PINK WAFFLES to my wife!!!
          – Tell me why it is suddenly doing this
  • 34. The Theme and Setting
      • A humble machine learning expert once lived in a small cubicle
      • One day the CEO walked in and said
          – Your machine recommended PINK WAFFLES to my wife!!!
          – Tell me why it is suddenly doing this
      • The machine learning expert could say nothing because he could not reproduce the conditions that the model was trained with
      • The CEO was not pleased
  • 35. Why?
  • 36. Diagram labels: Storm; Kafka; Twitter; Data Logger; Kafka Cluster (×3); Kafka API; Web Service; NAS; Web Data; Hadoop; Flume; HDFS; Data; Website
  • 37. Diagram labels: Storm; Kafka; Twitter; Data Logger; Kafka Cluster (×3); Kafka API; Web Service; NAS; Web Data; Hadoop; Flume; HDFS; Data; Website
      • Data arrives continuously
      • Learning steps can’t be tied to delayed data
      • It can be delayed arbitrarily
  • 38. The Essence of the Problem
      • Coupling data arrival with modeling makes the data chain brittle
          – Minor delays in data delivery will break modeling SLAs
      • But if data can arrive late and restate the past, then we can’t easily replicate a model build
      • Existing data chains don’t support full bitemporal queries
  • 39. Diagram labels: Twitter; MapR; Data Logger; Website; Snap; Data; Modeling; Model (×4); Mirror; Live System
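The point of the "Snap" box in the diagram is that a model trained from a frozen snapshot can be rebuilt exactly, even after late data has arrived in the live system. The sketch below illustrates that property under loud assumptions: `LiveStore`, `Model`, `train` and the snapshot name `snap-001` are hypothetical in-memory stand-ins written for this example, not MapR APIs, and "training" is reduced to computing a mean.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


class LiveStore:
    """Hypothetical stand-in for a data store that keeps changing and can be snapshotted."""

    def __init__(self) -> None:
        self._records: List[float] = []
        self._snapshots: Dict[str, Tuple[float, ...]] = {}

    def append(self, value: float) -> None:
        self._records.append(value)

    def snapshot(self, name: str) -> None:
        # Freeze the current contents; later appends do not affect the snapshot.
        self._snapshots[name] = tuple(self._records)

    def read_snapshot(self, name: str) -> Tuple[float, ...]:
        return self._snapshots[name]

    def read_live(self) -> Tuple[float, ...]:
        return tuple(self._records)


@dataclass(frozen=True)
class Model:
    snapshot_name: str  # recorded so that the build can be repeated later
    mean: float


def train(store: LiveStore, snapshot_name: str) -> Model:
    """Toy training step: the 'model' is the mean of the snapshot data."""
    data = store.read_snapshot(snapshot_name)
    return Model(snapshot_name, sum(data) / len(data))


store = LiveStore()
for value in (1.0, 2.0, 3.0):
    store.append(value)
store.snapshot("snap-001")
deployed = train(store, "snap-001")

store.append(10.0)  # late data arrives after the model was built

rebuilt = train(store, deployed.snapshot_name)
live_mean = sum(store.read_live()) / len(store.read_live())
print(rebuilt == deployed, deployed.mean, live_mean)
```

Training from the live data after the late record arrives would give a different result, which is the situation the expert in the first version of the story could not explain.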
  • 40. The New Story
      • A humble machine learning expert once lived in a small cubicle
      • One day the CEO walked in and said
          – Your machine recommended PINK WAFFLES to my wife!!!
          – Tell me why it is suddenly doing this
  • 41. The New Story
      • A humble machine learning expert once lived in a small cubicle
      • One day the CEO walked in and said
          – Your machine recommended PINK WAFFLES to my wife!!!
          – Tell me why it is suddenly doing this
      • The machine learning expert could
          – Pull out all previously deployed models
          – Exactly replicate any training run with any version of software
          – Point out that PINK WAFFLES were actually quite stylish
      • The CEO was very pleased … he ran off to buy pink waffles
  • 42. Expect more from Hadoop
  • 43. Expect MapR
  • 44. Contact me!
      • tdunning@maprtech.com or tdunning@apache.org
      • @ted_dunning
      • Come to the MapR booth

Editor's notes

  1. MapR has been selected by two of the companies most experienced with MapReduce technology, which is a testament to the technology advantages of MapR’s distribution. Amazon, through its Elastic MapReduce service (EMR), hosted over 2 million clusters in the past year. Amazon selected MapR to complement EMR as the only commercial Hadoop distribution being offered, sold and supported as a service by Amazon to its customers. Google, the pioneer of MapReduce and the company whose white paper on MapReduce inspired the creation of Hadoop, has also selected MapR to make our distribution available on Google Compute Engine. Hadoop in the cloud makes a great deal of sense: the elastic resource allocation that cloud computing is premised on works well for cluster-based data processing infrastructure used on varying analyses and data sets of indeterminate size. MapR has unique features, such as mirroring between sites and multi-tenancy support, that further enhance cloud deployments.
  2. MapR is used today across industries. Ten of the Fortune 100 are using MapR in production, as are leading Web 2.0 properties such as leading digital advertising platforms. These customers are using MapR in production for a variety of use cases. Examples include one of the largest credit card issuers in the world, which has standardized on MapR for fraud and consumer targeting applications. Other examples include a major health care group, national cyber security, and one of the largest retailers in the world. These are all provided by MapR’s complete distribution for Apache Hadoop.