© 2014 MapR Technologies
MapR Overview
BIG DATA: Hadoop
BEST PRODUCT: Top Ranked
BUSINESS IMPACT: Production Success
3 Trends
Forcing a revolution in enterprise architecture
Trend 1: Industry Leaders Compete and Win with Data
More Data Beats Better Algorithms
Collecting interaction data from ecommerce, social media, offline, and call centers
enables a “customer 360 view” and consumer intimacy
Competitive Advantage is Decided by 0.5%
Consumer financial services: a 1% improvement in fraud detection means hundreds of millions of dollars
Advertising and retail: a 0.5% improvement in lift means millions of dollars in increased profitability
Trend 2: Big Data is Overwhelming Traditional Systems
Enterprise Data Architecture
[Diagram labels: Enterprise Users, Outside Sources, Operational Systems, Analytical Systems]
OPERATIONAL SYSTEMS: PRODUCTION REQUIREMENTS
• Mission-critical reliability
• Transaction guarantees
• Deep security
• Real-time performance
• Backup and recovery
ANALYTICAL SYSTEMS: PRODUCTION REQUIREMENTS
• Interactive SQL
• Rich analytics
• Workload management
• Data governance
• Backup and recovery
Trend 3: Hadoop: The Disruptive Technology at the Core of Big Data
[Chart: job trends from Indeed.com, Jan ‘06 to Jan ‘14]
And 3 Realities
Reality 1: Hadoop Relieves the Pressure from Enterprise Systems
[Diagram labels: Enterprise Users, Operational Systems, Analytical Systems]
• Data staging
• Archive
• Data transformation
• Data exploration
• Streaming, interactions
Keys for Production Success
1 Reliability and DR
2 Interoperability
3 High performance
4 Supports operations and analytics
Reality 2: Hadoop is Being Used to Drive Small, Rapid Decisions
High Arrival Rate Data
• Clickstream
• Social media
• Sensor data, …
Business Impact
• Revenue optimization
• Risk mitigation
• Operational efficiency
Reality 3: Architecture Matters for Success
FOUNDATION
• Data protection & security
• High performance
• Multi-tenancy
• Workload management
• Open standards for integration
NEW APPLICATIONS, SLAs, TRUSTED INFORMATION, LOWER TCO
World-Record Performance on Cisco UCS
NEW MINUTESORT WORLD RECORD: 1.65 TB in 1 minute on 298 nodes
PREVIOUS RECORD: 1.6 TB with 2200 nodes
MapR: With a Fraction of the Hardware
Get the most out of your hardware infrastructure
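The "fraction of the hardware" comparison can be made concrete with a little arithmetic. The sketch below uses only the two figures on the slide and assumes both records cover the same one-minute window; the helper name `gb_per_node` and the 1 TB = 1000 GB convention are illustrative choices, not from the deck.

```python
def gb_per_node(terabytes: float, nodes: int) -> float:
    """Average data sorted per node in the one-minute window, in GB (1 TB = 1000 GB)."""
    return terabytes * 1000 / nodes


new_record = gb_per_node(1.65, 298)        # figures from the slide
previous_record = gb_per_node(1.6, 2200)   # figures from the slide
ratio = new_record / previous_record

print(f"New record:      {new_record:.2f} GB per node")
print(f"Previous record: {previous_record:.2f} GB per node")
print(f"Per-node ratio:  {ratio:.1f}x")
```

Under these assumptions, each node in the 298-node cluster sorted roughly 7.6 times as much data as a node in the 2200-node cluster.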
MapR: Hadoop Real World Examples
Largest Biometric Database in the World
PEOPLE
20 BILLION BIOMETRICS
National identification system in India for all citizens
Fingerprint and retinal scan images and citizen data
1 trillion+ ID verifications per week, geographically dispersed across 8 data centers
About 600m “residents” enrolled
Requires 100ms response times; zero data loss and cross-datacenter replication
Helping Farmers: Software and Insurance
• Help farmers protect and improve their farming operations
• Use machine learning to predict weather & other agribusiness elements
• Combine hyper-local weather monitoring, agronomic data modeling, and
high-resolution weather simulations
• Project weather for 2.5 years at every 20x20 plot across the US
• Climatology simulations need to quickly experiment at small scale and
then scale reliably
• MapR Hadoop to analyze >10 trillion data points from 2.5 million sensors
• Faster machine learning performance enables more/faster simulations
• MapR M7 enables geospatial database backed by Amazon S3
OBJECTIVES
CHALLENGES
SOLUTION
Lower risk with new insurance products through better data analytics
Business
Impact
“85% of farmer risk is weather-related. MapR has enabled us to provide a class of weather insurance
that was not available before, helping farmers protect their operations.”
IT Director, Climate Corporation
Cisco: 360° Customer View
Cisco uses integrated customer data to increase revenues
OBJECTIVES
• Create shared view of customer & operations across 75,000 employees
• Increase revenue opportunities with sales partners
CHALLENGES
• Customer information was siloed in different divisions
• Customer interactions were inconsistent and not satisfying
• Missed opportunities for upselling/cross selling
SOLUTION
• Use MapR to collect customer information across touch points
• Integrate billing, support, manufacturing, social media, websites, dial-in data
• Generate new sales leads internally and for partners
[Diagram: Architecture for Sales Partner Opportunities]
Business Impact
Cisco was able to analyze service sales opportunities in 1/10 the time, at 1/10 the cost, and generated $40 million in incremental service bookings in the first year.
Financial Services: Recommendation Engine & Real-time Targeting
GLOBAL FINANCIAL SERVICES CORPORATION
Making personalized real-time offers to credit card customers
OBJECTIVES
• Increase revenue and customer loyalty with real-time personalized offers
CHALLENGES
• Many different CRM tools and siloed targeting engines
• Developers and analysts are unable to access all customer data
• Want to increase speed and relevance of recommendations
SOLUTION
• MapR M7 centralizes analytics and operational apps on one platform
• Integrates all customer online and offline data into HBase in real-time: card member spend graph, merchant data, location, and feedback
• Centralized customer data repository provides more accurate insights
• Uses Mahout machine learning to provide real-time personalized offers
Business Impact
• Increases revenue and improves customer experience through real-time targeting
• A more flexible, scalable platform that’s a fraction of the cost of traditional technologies
• Ensures reliability with MapR’s high availability and disaster recovery features
Rubicon Project: Ad Optimization
Rubicon Project runs a real-time automated advertising platform
• Create open ad platform for over 100K global advertising brands and over
500 of the world’s premium publishers
• To keep up with their rapid growth, they needed to move to a
fault-tolerant, high-availability Hadoop production system
• Hadoop had become central to their operations but they were having
problems with instability
• Their 330-node Hadoop cluster processes 1M records/second
• They chose MapR for enterprise features such as high availability, data
protection and recoverability, disaster recovery, redundancy, and support
OBJECTIVES
CHALLENGES
SOLUTION
“Our company cannot run without Hadoop and MapR. We rely on MapR’s self-healing
HA, disaster recovery and advanced monitoring features to conduct 90 billion real-time
auctions on our global transaction platform.” Jan Gelin, VP of Engineering, Rubicon Project
Business
Impact
Operational Apps: Push Messaging Platform
MapR: Enabling the “smartest, most aware, precise, easy-to-use, scalable,
secure and powerful push messaging platform on the planet”
• Enable organizations to build one-on-one brand relationships
• Push messaging and geo-location targeting that
• Support large numbers of customers in a multi-tenant platform
• Target specific consumers in real time with relevant offers
• Increase reliability of push messaging while lowering data center costs
OBJECTIVES
CHALLENGES
SOLUTION
• Increasing engagement and customer loyalty for 100’s of leading brands
• Reduced hardware footprint by 50%
• Consolidated 8 Hadoop clusters into 1 MapR cluster
Business
Impact
• MapR Distribution for Hadoop with Apache HBase for operational workloads
• Data placement control enables efficient cluster resource management
Enterprise Data Hub Case Studies
Data Warehouse Optimization
FORTUNE 100 TELCO
Improve data services to customers while reducing enterprise architecture costs
OBJECTIVES
• Provide cloud, security, managed services, data center, & comms
• Report on customer usage, profiles, billing, and sales metrics
• Improve service: Measure service quality and repair metrics
• Reduce customer churn – identify and address IP network hotspots
CHALLENGES
• Cost of ETL & DW storage for growing IP and clickstream data; >3 months
• Reliability & cost of Hadoop alternatives limited ETL & storage offload
SOLUTION
• MapR Data Platform for data staging, ETL, and storage at 1/10th the cost
• MapR provided smallest datacenter footprint with best DR solution
• Enterprise-grade: NFS file management, consistent snapshots & mirroring
Business Impact
• Increased scale to handle network IP and clickstream data
• Reduced workload on DW to maintain reporting SLAs to business
• Unlocked new insights into network usage and customer preferences
Mainframe Offload & Optimization
Free up MIPS with Hadoop to Lower Cost and Modernize Data Architecture
OBJECTIVES
• Reduce costs: defer expensive mainframe upgrades and reduce MIPS
• Maintain business SLAs
• Open standards: convert gradually to next-gen data architecture (Hadoop)
CHALLENGES
• Connect and transform unique data formats (EBCDIC vs. ASCII)
• Skills shortage: Hadoop and mainframe (COBOL & JCL)
• Reliability and flexibility of alternate systems
SOLUTION
• Syncsort connectivity and data conversions on MapR
• MapR uniquely handles small files without additional ETL steps to meet SLA
• MapR is the only Hadoop distribution with the reliability mainframe customers expect
Business Impact
• Reduce storage costs: Go from $100K/TB to $1K/TB by migrating data to Hadoop
• Use MIPS wisely: Save average of $7K per MIPS by offloading batch jobs to Hadoop
• Deliver powerful new insights: combine mainframe data with big data for deep insights
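The EBCDIC vs. ASCII challenge listed on this slide can be illustrated with a minimal sketch. It uses Python's built-in `cp037` codec, one common EBCDIC code page; which code page a given mainframe actually uses is an assumption here, and the function name `ebcdic_to_ascii` and the sample record are hypothetical. It does not model the Syncsort tooling named on the slide.

```python
def ebcdic_to_ascii(record: bytes, codepage: str = "cp037") -> bytes:
    """Decode an EBCDIC text record and re-encode it as ASCII.

    Assumes the record holds only character data; packed-decimal or binary
    fields (common in COBOL layouts) would need field-by-field handling.
    """
    text = record.decode(codepage)
    # errors="replace" turns any character with no ASCII equivalent into "?"
    return text.encode("ascii", errors="replace")


# Build a sample EBCDIC record by encoding a known string (illustrative data).
sample = "ACCT 00123 OK".encode("cp037")
converted = ebcdic_to_ascii(sample)

print(sample.hex())   # the EBCDIC bytes differ from the ASCII bytes
print(converted)
```

A whole-record decode like this only works for pure text; real mainframe extracts usually mix text with numeric encodings, which is one reason the slide lists format conversion as a challenge.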
Security and Risk Mgmt. Case Studies
Solutionary: Managed Security Services Provider
Threat detection on real-time streaming data via platform as a service (PaaS)
• To address their growing customer base by processing trillions of messages (petabyte)
per year while continuing to provide reliable security services
• To improve data analytics by leveraging newer, more granular unstructured data
sources
“MapR has taken Apache Hadoop to a new level of performance and manageability. It integrates into
our systems seamlessly to help us boost the speed and capacity of data analytics for our clients.”
- Dave Caplinger, Director of Architecture, Solutionary
• Expanding existing database solution to meet demand was cost prohibitive
• The existing technology could not process unstructured data at scale
• Replaced RDBMS with MapR M7 to scale while retaining reliability requirements
• Reduced time needed to investigate security events for relevance and impact
• Improved data analytics, enabling new services and security analytics
• 2x faster performance compared to competing solutions
OBJECTIVES
CHALLENGES
SOLUTION
Business
Impact
Leader in Magic Quadrant
Zions Bank: From SIEM to Fraud Detection
Cost effective security analytics and fraud detection on one platform
• To operationalize big data fraud detection: Fraud Operations and Security Analytics
team at Zions maintains data stores, builds statistical models to detect fraud, and then
uses these models to data mine and evaluate suspicious activity
• (Global bank fraud costs $200B annually)
“We initially got into centralizing all of our data from an information security perspective. We then saw
that we could use this same environment to help with fraud detection”
Michael Fowkes - SVP Fraud Operations and Security Analytics
• Existing technology infrastructure could not scale
• Timeliness of reports degraded over the last several years
• Chose MapR and cut storage costs by 50%
• Gained huge performance advantage – Querying time reduced from 24 hours to 30
min on 1.2 PB of data
• Leverage MapR scale for increased model accuracy and deeper insights
OBJECTIVES
CHALLENGES
SOLUTION
Business
Impact
Cisco: Global Security Intelligence Operations (MSSP)
Operational and analytical security applications on one platform
• To protect customer networks through early-warning intelligence & vulnerability analysis
• To better react to evolving security threats in real-time
• Collect additional telemetry data from customers' firewalls, intrusion prevention systems
• Different analytical teams derived security intelligence in silos and lacked synergy
• Inability to scale with existing infrastructure to a million events per second from nearly
100 different channels over tens of thousands of distributed sensors
OBJECTIVES
CHALLENGES
SOLUTION
Business
Impact
• All analytic teams leverage a common platform leading to operational efficiencies
• Capability to scale - aggregating and analyzing millions of data points in real time
• Update customer networks with new threat footprints within a 2 to 5 minute window
• MapR M7: Central hub for all of the security analytics teams
• Stream, interactive, graph and batch processing on MapR with the flexibility to
perform closed-loop analytics across these functions in real time
• Key Features: Scale, enterprise-grade, operational efficiency and high performance
Cisco SIO Hadoop Stack
Sources (Globally Dispersed Datacenters): sensor data, firewall logs, intrusion protection system logs, security appliance logs
1 million events/sec. Over 100 channels
MapR M7 Distribution for Hadoop
• Spark Streaming for known threats & aggregation
• Batch Processing: Mahout, MLLib
• SQL Queries and Reporting: Shark, Impala
• Graph Processing: GraphX & TitanDB
Closed-Loop Operations: New Threat Footprint within 2-5 min
Benefits: Unified platform for Analytics
• Low Operational Costs
• Faster Response Times
• Better Algorithms
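As a rough sizing aid for the figures on this slide, the sketch below works out what 1 million events/sec implies per channel and per 2 to 5 minute threat-footprint window. It assumes a constant arrival rate and an even spread over exactly 100 channels (the slide says "over 100"); the constant and function names are illustrative, not from the deck.

```python
EVENTS_PER_SECOND = 1_000_000   # from the slide
CHANNELS = 100                  # assumption: slide says "over 100"


def events_in_window(minutes: float, rate: int = EVENTS_PER_SECOND) -> int:
    """Events arriving during a window of the given length at a constant rate."""
    return int(rate * minutes * 60)


per_channel = EVENTS_PER_SECOND // CHANNELS
low, high = events_in_window(2), events_in_window(5)

print(f"Average per channel: {per_channel:,} events/sec")
print(f"Events per 2-5 minute window: {low:,} to {high:,}")
```

Under these assumptions, each closed-loop update has to account for somewhere between 120 million and 300 million newly arrived events.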
MapR is the Hadoop Technology Leader
BIG DATA
HADOOP
MapR Distribution for Hadoop
APACHE HADOOP AND OSS ECOSYSTEM
EXECUTION ENGINES
• Batch: MapReduce v1 & v2, YARN, Pig, Cascading, Spark, Tez*
• Streaming: Spark Streaming, Storm*
• NoSQL & Search: HBase, Solr, Accumulo*
• ML, Graph: Mahout, MLLib, GraphX
• SQL: Hive, Impala, Shark, Drill*
DATA GOVERNANCE AND OPERATIONS
• Data Integration & Access: Sqoop, Flume, HttpFS, Hue
• Workflow & Data Governance: Oozie, Falcon*
• Provisioning & Coordination: Juju, Savannah*, Whirr, ZooKeeper
• Security: Sentry*, Knox*
MapR Data Platform (Random Read/Write)
• MapR-FS (POSIX)
• MapR-DB (High-Performance NoSQL)
• APIs: NFS, HDFS API, HBase API, JSON API
Data Hub, Enterprise Grade, Operational
MapR Summary
BIG DATA: Hadoop
BEST PRODUCT: Top Ranked
BUSINESS IMPACT: Production Success
Q&A
@mapr maprtech
nitin@mapr.com
Engage with us!
MapR
maprtech
mapr-technologies

Más contenido relacionado

La actualidad más candente

Appplications – Driving Expansion In The Cloud
Appplications – Driving Expansion In The CloudAppplications – Driving Expansion In The Cloud
Appplications – Driving Expansion In The CloudNetAppUK
 
Accelerate your business in a data-driven world
Accelerate your business in a data-driven worldAccelerate your business in a data-driven world
Accelerate your business in a data-driven worldNetApp
 
Converged Everything, Converged Infrastructure delivering business value and ...
Converged Everything, Converged Infrastructure delivering business value and ...Converged Everything, Converged Infrastructure delivering business value and ...
Converged Everything, Converged Infrastructure delivering business value and ...NetAppUK
 
Benefits of Transferring Real-Time Data to Hadoop at Scale
Benefits of Transferring Real-Time Data to Hadoop at ScaleBenefits of Transferring Real-Time Data to Hadoop at Scale
Benefits of Transferring Real-Time Data to Hadoop at ScaleHortonworks
 
Transforming Business with Intel and SAP HANA 2
Transforming Business with Intel and SAP HANA 2 Transforming Business with Intel and SAP HANA 2
Transforming Business with Intel and SAP HANA 2 PT Datacomm Diangraha
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Vantara
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Cloudera, Inc.
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformMapR Technologies
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action MapR Technologies
 
Postgres Vision 2018: How to Consume your Database Platform On-premises
Postgres Vision 2018: How to Consume your Database Platform On-premisesPostgres Vision 2018: How to Consume your Database Platform On-premises
Postgres Vision 2018: How to Consume your Database Platform On-premisesEDB
 
Powering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data SystemsPowering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data SystemsHitachi Vantara
 
Postgres Vision 2018: The Pragmatic Cloud
Postgres Vision 2018:  The Pragmatic CloudPostgres Vision 2018:  The Pragmatic Cloud
Postgres Vision 2018: The Pragmatic CloudEDB
 
10 Reasons to Choose NetApp for EUC/VDI
10 Reasons to Choose NetApp for EUC/VDI10 Reasons to Choose NetApp for EUC/VDI
10 Reasons to Choose NetApp for EUC/VDINetApp
 
Datameer Analytics Solution
Datameer Analytics SolutionDatameer Analytics Solution
Datameer Analytics Solutiontempledf
 
Postgres Vision 2018: Making Modern an Old Legacy System
Postgres Vision 2018: Making Modern an Old Legacy SystemPostgres Vision 2018: Making Modern an Old Legacy System
Postgres Vision 2018: Making Modern an Old Legacy SystemEDB
 
Better Business in a Flash
Better Business in a FlashBetter Business in a Flash
Better Business in a FlashNetApp
 
Postgres Vision 2018: The Changing Role of the DBA in the Cloud
Postgres Vision 2018: The Changing Role of the DBA in the CloudPostgres Vision 2018: The Changing Role of the DBA in the Cloud
Postgres Vision 2018: The Changing Role of the DBA in the CloudEDB
 

La actualidad más candente (20)

Appplications – Driving Expansion In The Cloud
Appplications – Driving Expansion In The CloudAppplications – Driving Expansion In The Cloud
Appplications – Driving Expansion In The Cloud
 
Accelerate your business in a data-driven world
Accelerate your business in a data-driven worldAccelerate your business in a data-driven world
Accelerate your business in a data-driven world
 
Converged Everything, Converged Infrastructure delivering business value and ...
Converged Everything, Converged Infrastructure delivering business value and ...Converged Everything, Converged Infrastructure delivering business value and ...
Converged Everything, Converged Infrastructure delivering business value and ...
 
HPE & SAP Strategic Alliance
HPE & SAP Strategic AllianceHPE & SAP Strategic Alliance
HPE & SAP Strategic Alliance
 
Benefits of Transferring Real-Time Data to Hadoop at Scale
Benefits of Transferring Real-Time Data to Hadoop at ScaleBenefits of Transferring Real-Time Data to Hadoop at Scale
Benefits of Transferring Real-Time Data to Hadoop at Scale
 
Transforming Business with Intel and SAP HANA 2
Transforming Business with Intel and SAP HANA 2 Transforming Business with Intel and SAP HANA 2
Transforming Business with Intel and SAP HANA 2
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop Solution
 
Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19Introducing Cloudera DataFlow (CDF) 2.13.19
Introducing Cloudera DataFlow (CDF) 2.13.19
 
An Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data PlatformAn Introduction to the MapR Converged Data Platform
An Introduction to the MapR Converged Data Platform
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action
 
Data Process Systems, connecting everything
Data Process Systems, connecting everythingData Process Systems, connecting everything
Data Process Systems, connecting everything
 
Postgres Vision 2018: How to Consume your Database Platform On-premises
Postgres Vision 2018: How to Consume your Database Platform On-premisesPostgres Vision 2018: How to Consume your Database Platform On-premises
Postgres Vision 2018: How to Consume your Database Platform On-premises
 
Powering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data SystemsPowering the Enterprise Cloud with CSC and Hitachi Data Systems
Powering the Enterprise Cloud with CSC and Hitachi Data Systems
 
Postgres Vision 2018: The Pragmatic Cloud
Postgres Vision 2018:  The Pragmatic CloudPostgres Vision 2018:  The Pragmatic Cloud
Postgres Vision 2018: The Pragmatic Cloud
 
10 Reasons to Choose NetApp for EUC/VDI
10 Reasons to Choose NetApp for EUC/VDI10 Reasons to Choose NetApp for EUC/VDI
10 Reasons to Choose NetApp for EUC/VDI
 
Couchbase
CouchbaseCouchbase
Couchbase
 
Datameer Analytics Solution
Datameer Analytics SolutionDatameer Analytics Solution
Datameer Analytics Solution
 
Postgres Vision 2018: Making Modern an Old Legacy System
Postgres Vision 2018: Making Modern an Old Legacy SystemPostgres Vision 2018: Making Modern an Old Legacy System
Postgres Vision 2018: Making Modern an Old Legacy System
 
Better Business in a Flash
Better Business in a FlashBetter Business in a Flash
Better Business in a Flash
 
Postgres Vision 2018: The Changing Role of the DBA in the Cloud
Postgres Vision 2018: The Changing Role of the DBA in the CloudPostgres Vision 2018: The Changing Role of the DBA in the Cloud
Postgres Vision 2018: The Changing Role of the DBA in the Cloud
 

Destacado

Chapter 04 computer codes
Chapter 04 computer codesChapter 04 computer codes
Chapter 04 computer codesIIUI
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with ExamplesJoe McTee
 
Error Detection and Correction
Error Detection and CorrectionError Detection and Correction
Error Detection and CorrectionTechiNerd
 
Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)Project Student
 
Errror Detection and Correction
Errror Detection and CorrectionErrror Detection and Correction
Errror Detection and CorrectionMahesh Kumar Attri
 
Chapter 2 : TEXT
Chapter 2 : TEXTChapter 2 : TEXT
Chapter 2 : TEXTazira96
 
Uses of computer
Uses of computerUses of computer
Uses of computera2zeenice
 
Error Detection And Correction
Error Detection And CorrectionError Detection And Correction
Error Detection And CorrectionRenu Kewalramani
 

Destacado (10)

Chapter 04 computer codes
Chapter 04 computer codesChapter 04 computer codes
Chapter 04 computer codes
 
Hadoop tools with Examples
Hadoop tools with ExamplesHadoop tools with Examples
Hadoop tools with Examples
 
Ascii 03
Ascii 03Ascii 03
Ascii 03
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
Error Detection and Correction
Error Detection and CorrectionError Detection and Correction
Error Detection and Correction
 
Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)Ascii and Unicode (Character Codes)
Ascii and Unicode (Character Codes)
 
Errror Detection and Correction
Errror Detection and CorrectionErrror Detection and Correction
Errror Detection and Correction
 
Chapter 2 : TEXT
Chapter 2 : TEXTChapter 2 : TEXT
Chapter 2 : TEXT
 
Uses of computer
Uses of computerUses of computer
Uses of computer
 
Error Detection And Correction
Error Detection And CorrectionError Detection And Correction
Error Detection And Correction
 

Similar a Hadoop In The Real World

Integrating Hadoop into your enterprise IT environment
Integrating Hadoop into your enterprise IT environmentIntegrating Hadoop into your enterprise IT environment
Integrating Hadoop into your enterprise IT environmentMapR Technologies
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014MapR Technologies
 
Hadoop: Revolutionizing Analytics AND Operations
Hadoop: Revolutionizing Analytics AND OperationsHadoop: Revolutionizing Analytics AND Operations
Hadoop: Revolutionizing Analytics AND OperationsMapR Technologies
 
Key Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareKey Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareMapR Technologies
 
Getting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixGetting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixNicolas Morales
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR Technologies
 
MapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data PlatformMapR Streams and MapR Converged Data Platform
MapR Streams and MapR Converged Data PlatformMapR Technologies
 
Meruvian - Introduction to MapR
Meruvian - Introduction to MapRMeruvian - Introduction to MapR
Meruvian - Introduction to MapRThe World Bank
 
151116 Sedania Cloudera BDA Profile
151116 Sedania Cloudera BDA Profile151116 Sedania Cloudera BDA Profile
151116 Sedania Cloudera BDA ProfileZarul Zaabah
 
Why Business is Better in the Cloud
Why Business is Better in the CloudWhy Business is Better in the Cloud
Why Business is Better in the CloudPerficient, Inc.
 
Powering the "As it Happens" Business
Powering the "As it Happens" BusinessPowering the "As it Happens" Business
Powering the "As it Happens" BusinessMapR Technologies
 
Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise WeAreEsynergy
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data Analytics3 Benefits of Multi-Temperature Data Management for Data Analytics
3 Benefits of Multi-Temperature Data Management for Data AnalyticsMapR Technologies
 
Best Practices for Monitoring Cloud Networks
Best Practices for Monitoring Cloud NetworksBest Practices for Monitoring Cloud Networks
Best Practices for Monitoring Cloud NetworksThousandEyes
 
Bmc joe goldberg
Bmc joe goldbergBmc joe goldberg
Bmc joe goldbergBigDataExpo
 
Turning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformTurning Data into Business Value with a Modern Data Platform
Turning Data into Business Value with a Modern Data PlatformCloudera, Inc.
 

Similar a Hadoop In The Real World (20)

Integrating Hadoop into your enterprise IT environment
Integrating Hadoop into your enterprise IT environmentIntegrating Hadoop into your enterprise IT environment
Integrating Hadoop into your enterprise IT environment
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
 
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
Fast and Furious: From POC to an Enterprise Big Data Stack in 2014
 
Hadoop: Revolutionizing Analytics AND Operations
Hadoop: Revolutionizing Analytics AND OperationsHadoop: Revolutionizing Analytics AND Operations
Hadoop: Revolutionizing Analytics AND Operations
 
Key Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShareKey Considerations for Putting Hadoop in Production SlideShare
Key Considerations for Putting Hadoop in Production SlideShare
 
Getting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with BluemixGetting started with Hadoop on the Cloud with Bluemix
Getting started with Hadoop on the Cloud with Bluemix
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
 
Big Data and Analytics
Big Data and AnalyticsBig Data and Analytics
Big Data and Analytics
 
Big Data and Analytics
Hadoop In The Real World

  • 1. © 2014 MapR Technologies
  • 2. MapR Overview. Big Data: Hadoop. Best Product: Top Ranked. Business Impact: Production Success.
  • 3. 3 Trends: Forcing a revolution in enterprise architecture
  • 4. Trend 1: Industry Leaders Compete and Win with Data. More Data Beats Better Algorithms: collecting interaction data from ecommerce, social media, offline, and call centers enables a “customer 360 view” and consumer intimacy. Competitive Advantage is Decided by 0.5%: • Consumer financial services: 1% improvement in fraud means hundreds of millions of dollars • Advertising and retail: 0.5% improvement in lift means millions of dollars increase in profitability
  • 5. Trend 2: Big Data is Overwhelming Traditional Systems. Enterprise data architecture: outside sources, operational systems, analytical systems, enterprise users. Production requirements: • Mission-critical reliability • Transaction guarantees • Deep security • Real-time performance • Backup and recovery. Production requirements: • Interactive SQL • Rich analytics • Workload management • Data governance • Backup and recovery
  • 6. Trend 3: Hadoop: The Disruptive Technology at the Core of Big Data. [Chart: job trends from Indeed.com, Jan ‘06 to Jan ‘14]
  • 7. And 3 Realities
  • 8. Reality 1: Hadoop Relieves the Pressure from Enterprise Systems (operational systems, analytical systems, enterprise users) • Data staging • Archive • Data transformation • Data exploration • Streaming, interactions. Keys for Production Success: 1 Reliability and DR; 2 Interoperability; 3 High performance; 4 Supports operations and analytics
  • 9. Reality 2: Hadoop is Being Used to Drive Small, Rapid Decisions. High Arrival Rate Data • Clickstream • Social media • Sensor data, … Business Impact • Revenue optimization • Risk mitigation • Operational efficiency
  • 10. Reality 3: Architecture Matters for Success. Foundation
  • 11. Reality 3: Architecture Matters for Success. Foundation: data protection & security; high performance; multi-tenancy; workload management; open standards for integration. New applications; SLAs; trusted information; lower TCO
  • 12. World-Record Performance on Cisco UCS. New MinuteSort world record: 1.65 TB in 1 minute on 298 nodes. Previous record: 1.6 TB with 2200 nodes. MapR: with a fraction of the hardware. Get the most out of your hardware infrastructure
  • 13. MapR: Hadoop Real World Examples
  • 14. Largest Biometric Database in the World. People; 20 billion biometrics. National identification system in India for all citizens. Fingerprint and retinal scan images and citizen data. 1 trillion+ ID verifications per week, geographically dispersed across 8 data centers. About 600m “residents” enrolled. Requires 100ms response times; zero data loss and cross-datacenter replication
  • 15. Helping Farmers: Software and Insurance • Help farmers protect and improve their farming operations • Use machine learning to predict weather & other agribusiness elements • Combine hyper-local weather monitoring, agronomic data modeling, and high-resolution weather simulations • Project weather for 2.5 years at every 20x20 plot across the US • Climatology simulations need to quickly experiment at small scale and then scale reliably • MapR Hadoop to analyze >10 trillion data points from 2.5 million sensors • Faster machine learning performance enables more/faster simulations • MapR M7 enables geospatial database backed by Amazon S3. (Slide sections: Objectives, Challenges, Solution, Business Impact.) Lower risk with new insurance products through better data analytics. “85% of farmer risk is weather-related. MapR has enabled us to provide a class of weather insurance that was not available before, helping farmers protect their operations.” IT Director, Climate Corporation
  • 16. Cisco: 360° Customer View. Cisco uses integrated customer data to increase revenues. Cisco was able to analyze service sales opportunities in 1/10 the time, at 1/10 the cost, and generated $40 million in incremental service bookings in the first year. • Create shared view of customer & operations across 75,000 employees • Increase revenue opportunities with sales partners • Customer information was siloed in different divisions • Customer interactions were inconsistent and not satisfying • Missed opportunities for upselling/cross selling • Use MapR to collect customer information across touch points • Integrate billing, support, manufacturing, social media, websites, dial-in data • Generate new sales leads internally and for partners. (Slide sections: Objectives, Challenges, Solution, Business Impact; diagram: Architecture for Sales Partner Opportunities.)
  • 17. Financial Services: Recommendation Engine & Real-time Targeting (global financial services corporation). Making personalized real-time offers to credit card customers • Increase revenue and customer loyalty with real-time personalized offers • Increases revenue and improves customer experience through real-time targeting • A more flexible, scalable platform that’s a fraction of the cost of traditional technologies • Ensures reliability with MapR’s high availability and disaster recovery features • Many different CRM tools and siloed targeting engines • Developers and analysts are unable to access all customer data • Want to increase speed and relevance of recommendations • MapR M7 centralizes analytics and operational apps on one platform • Integrates all customer online and offline data into HBase in real-time: card member spend graph, merchant data, location, and feedback • Centralized customer data repository provides more accurate insights • Uses Mahout machine learning to provide real-time personalized offers. (Slide sections: Objectives, Challenges, Solution, Business Impact.)
  • 18. Rubicon Project: Ad Optimization. Rubicon Project runs a real-time automated advertising platform • Create open ad platform for over 100K global advertising brands and over 500 of the world’s premium publishers • To keep up with their rapid growth, they needed to move to a fault-tolerant, high-availability Hadoop production system • Hadoop had become central to their operations but they were having problems with instability • Their 330-node Hadoop cluster processes 1M records/second • They chose MapR for enterprise features such as high availability, data protection and recoverability, disaster recovery, redundancy, and support. (Slide sections: Objectives, Challenges, Solution, Business Impact.) “Our company cannot run without Hadoop and MapR. We rely on MapR’s self-healing HA, disaster recovery and advanced monitoring features to conduct 90 billion real-time auctions on our global transaction platform.” Jan Gelin, VP of Engineering, Rubicon Project
  • 19. Operational Apps: Push Messaging Platform. MapR: Enabling the “smartest, most aware, precise, easy-to-use, scalable, secure and powerful push messaging platform on the planet” • Enable organizations to build one-on-one brand relationships • Push messaging and geo-location targeting that • Support large numbers of customers in a multi-tenant platform • Target specific consumers in real time with relevant offers • Increase reliability of push messaging while lowering data center costs. (Slide sections: Objectives, Challenges, Solution.) Business Impact • Increasing engagement and customer loyalty for 100’s of leading brands • Reduced hardware footprint by 50% • Consolidated 8 Hadoop clusters into 1 MapR cluster • MapR Distribution for Hadoop with Apache HBase for operational workloads • Data placement control enables efficient cluster resource management
  • 20. Enterprise Data Hub Case Studies
  • 21. Data Warehouse Optimization (Fortune 100 telco). Improve data services to customers while reducing enterprise architecture costs • Provide cloud, security, managed services, data center, & comms • Report on customer usage, profiles, billing, and sales metrics • Improve service: Measure service quality and repair metrics • Reduce customer churn: identify and address IP network hotspots • Cost of ETL & DW storage for growing IP and clickstream data; >3 months • Reliability & cost of Hadoop alternatives limited ETL & storage offload • MapR Data Platform for data staging, ETL, and storage at 1/10th the cost • MapR provided smallest datacenter footprint with best DR solution • Enterprise-grade: NFS file management, consistent snapshots & mirroring. (Slide sections: Objectives, Challenges, Solution.) Business Impact • Increased scale to handle network IP and clickstream data • Reduced workload on DW to maintain reporting SLAs to business • Unlocked new insights into network usage and customer preferences
  • 22. Mainframe Offload & Optimization. Free up MIPS with Hadoop to Lower Cost and Modernize Data Architecture • Reduce costs: defer expensive mainframe upgrades and reduce MIPS • Maintain business SLAs • Open standards: convert gradually to next-gen data architecture (Hadoop) • Connect and transform unique data formats (EBCDIC vs. ASCII) • Skills shortage: Hadoop and mainframe (COBOL & JCL) • Reliability and flexibility of alternate systems • Syncsort connectivity and data conversions on MapR • MapR uniquely handles small files without additional ETL steps to meet SLA • MapR is the only Hadoop distribution with the reliability mainframe customers expect. (Slide sections: Objectives, Challenges, Solution.) Business Impact • Reduce storage costs: Go from $100K/TB to $1K/TB by migrating data to Hadoop • Use MIPS wisely: Save average of $7K per MIPS by offloading batch jobs to Hadoop • Deliver powerful new insights: combine mainframe data with big data for deep insights
  • 23. © 2014 MapR Technologies 23© 2014 MapR Technologies Security and Risk Mgmt. Case Studies
  • 24. © 2014 MapR Technologies 24 Solutionary: Managed Security Services Provider Threat detection on real-time streaming data via platform as a service (PaaS) • To address their growing customer base by processing trillions of messages (petabyte) per year while continuing to provide reliable security services • To improve data analytics by leveraging newer, more granular unstructured data sources ”MapR has taken Apache Hadoop to a new level of performance and manageability. It integrates into our systems seamlessly to help us boost the speed and capacity of data analytics for our clients.” - Dave Caplinger, Director of Architecture, Solutionary • Expanding existing database solution to meet demand was cost prohibitive • The existing technology could not process unstructured data at scale • Replaced RDBMS with MapR M7 to scale while retaining reliability requirements • Reduced time needed to investigate security events for relevance and impact • Improved data analytics, enabling new services and security analytics • 2x faster performance compared to competing solutions OBJECTIVES CHALLENGES SOLUTION Business Impact Leader in Magic Quadrant
  • 25. © 2014 MapR Technologies 25 Zions Bank: From SIEM to Fraud Detection Cost effective security analytics and fraud detection on one platform • To operationalize big data fraud detection: Fraud Operations and Security Analytics team at Zions maintains data stores, builds statistical models to detect fraud, and then uses these models to data mine and evaluate suspicious activity • (Global bank fraud costs $200B annually) “We initially got into centralizing all of our data from an information security perspective. We then saw that we could use this same environment to help with fraud detection” Michael Fowkes - SVP Fraud Operations and Security Analytics • Existing technology infrastructure could not scale • Timeliness of reports degraded over the last several years • Chose MapR and cut storage costs by 50% • Gained huge performance advantage – Querying time reduced from 24 hours to 30 min on 1.2 PB of data • Leverage MapR scale for increased model accuracy and deeper insights OBJECTIVES CHALLENGES SOLUTION Business Impact
  • 26. © 2014 MapR Technologies 26 Cisco: Global Security Intelligence Operations (MSSP) Operational and analytical security applications on one platform • To protect customer networks through early-warning intelligence & vulnerability analysis • To better react to evolving security threats in real-time • Collect additional telemetry data from customers' firewalls, intrusion prevention systems • Different analytical teams derived security intelligence in silos and lacked synergy • Inability to scale with existing infrastructure to a million events per second from nearly 100 different channels over tens of thousands of distributed sensors OBJECTIVES CHALLENGES SOLUTION Business Impact • All analytic teams leverage a common platform leading to operational efficiencies • Capability to scale - aggregating and analyzing millions of data points in real time • Update customer networks with new threat footprints within a 2 to 5 minute window • MapR M7: Central hub for all of the security analytics teams • Stream, interactive, graph and batch processing on MapR with the flexibility to perform closed-loop analytics across these functions in real time • Key Features: Scale, enterprise-grade, operational efficiency and high performance
  • 27. Cisco SIO Hadoop Stack (diagram). Data sources from globally dispersed datacenters: sensor data, firewall logs, intrusion protection system logs, security appliance logs. Platform: MapR M7 Distribution for Hadoop, 1 million events/sec. over 100 channels. Processing: Spark Streaming for known threats & aggregation; batch processing (Mahout, MLLib); SQL queries and reporting (Shark, Impala); graph processing (GraphX & TitanDB). Closed-loop operations: new threat footprint within 2-5 min. Benefits: unified platform for analytics; low operational costs; faster response times; better algorithms
  • 28. MapR is the Hadoop Technology Leader BIG DATA HADOOP
  • 29. MapR Distribution for Hadoop (diagram). Apache Hadoop and OSS ecosystem (execution engines; data governance and operations), with category labels Batch, Streaming, NoSQL & Search, ML/Graph, SQL, Workflow & Data Governance, Data Integration & Access, Provisioning & Coordination, and Security. Components shown: YARN, MapReduce v1 & v2, Pig, Cascading, Spark, Spark Streaming, Storm*, Tez*, HBase, Solr, Accumulo*, Mahout, MLLib, GraphX, Hive, Impala, Shark, Drill*, Sentry*, Knox*, Oozie, Falcon*, ZooKeeper, Sqoop, Flume, HttpFS, Hue, Juju, Savannah*, Whirr. Access interfaces: NFS, HDFS API, HBase API, JSON API. MapR Data Platform (random read/write; Data Hub, Enterprise Grade, Operational): MapR-FS (POSIX), MapR-DB (high-performance NoSQL)
  • 30. MapR Summary BIG DATA BEST PRODUCT BUSINESS IMPACT Hadoop Top Ranked Production Success
  • 31. Q&A @mapr maprtech nitin@mapr.com Engage with us! MapR maprtech mapr-technologies

Editor's notes

  1. Thank you for the opportunity WWT Art Cisco
  2. Thank you for your time today. Today we’ll walk through a brief presentation to give you an overview of MapR. The high-level message can be summarized in 3 points: (1) Hadoop is the leading technology for a Big Data platform, with the power to transform the customer’s business. (2) MapR gives you the most technologically advanced distribution for Hadoop. (3) MapR has the product, services, and partner network to ensure production success and continued success.
  3. Hadoop is making CIOs rethink their data architecture. It is a fundamental shift in the economics of data storage, processing, and analytics, and it is opening up entirely new business opportunities. Let’s talk about 3 key trends we are seeing, as well as 3 realities, or implications for your business and your “readiness” to harness the power of big data and Hadoop.
  4. The first trend is that the industry leaders have shown how to use big data to compete and win in their markets. It’s no longer a nice-to-have; you need big data to compete. Google pioneered MapReduce processing on commodity hardware and used that to catapult themselves into the leading search engine even though they were 19th in the market. Yahoo! leveraged these ideas to create Hadoop to keep up with Google, and many mainstream companies have followed with new data-driven applications such as “people you may know” (started by LinkedIn and now used by Facebook, Twitter, and every social application), product recommendation engines, contextual and personalized music services (Beats), measuring digital media effectiveness (comScore), serving more relevant/targeted ads (Comcast, Rubicon Project), fraud and risk detection, healthcare efficacy, and more. What makes the difference? A lot of attention is given to data science and developing sophisticated new algorithms, but in many cases just having more data beats better algorithms. (Make the point about collecting more consumer interaction data as well as transaction data, as an example.) In addition, competitive advantage is decided by very small percentages. Just a 1% improvement in fraud can mean hundreds of millions of dollars in savings. A ½% lift in advertising effectiveness means millions in new product sales and profitability. The same can be applied to customer churn, disease diagnosis, and more.
  5. A second trend in enterprise architecture has been big data overwhelming the existing workload-specific systems that are in production. (List of requirements for each of these on the side in text.) People started with mainframes or operational systems that run ERP, finance, CRM, and other mission-critical applications. They require… (pick out attributes you want to stress on the left). You also have data warehouses, marts, data mining, and other analytical systems that pull data from these operational and other systems to provide insights to the business for decision making. The amount and variety of data has been overloading these systems. As you try to ingest new types of data, you reach a point where these systems are not cost-effective to scale to terabytes or petabytes of data.
  6. Hadoop has become the de facto big data platform, allowing organizations to keep up with big data and feed data-driven applications and processes. This chart shows the percentage growth of jobs from Indeed.com. Compared to other popular technologies such as MongoDB and Cassandra, Hadoop is not only the fastest-growing big data technology, it’s one of the fastest-growing technologies, period. Hadoop has the most robust ecosystem and momentum and is the big data platform of choice for industry-leading, data-driven companies. (Also of interest: Indeed.com, a subsidiary of a Japanese-owned company, is a MapR customer; they harness and analyze all of the job trends data using MapR.)
  7. The first reality is that as people put Hadoop into production to relieve the pressure from other systems in their enterprise architecture, it needs to be reliable. Hadoop needs to be held to the same enterprise standards as your Oracle, SAP, Teradata, NetApp storage, or any other enterprise system. Many organizations are putting Hadoop into their data center to provide (list of use cases underneath) … It can do all of this and more, but for Hadoop to act as a system of record, it must provide the same guarantees for SLAs, performance, data protection, and more. Most importantly, Hadoop has the potential for both analytics AND operations. It can be used to optimize the data warehouse or to provide batch data refining or storage, but Hadoop can also support many operational analytics or database operations/jobs when done right.
  8. In a recent article (http://www.cmswire.com/cms/big-data/5-things-to-lessen-your-anxiety-about-big-data-024382.php), Tom Davenport says “Big data’s biggest wins come from making many small decisions vs. one that’s huge. The majority of big data driven decisions will be recurring, made at speed (in milliseconds), and at scale; actions will be taken automatically (vs. reviewed and approved by an individual). Examples include ad platforms making many constant adjustments, fraud detection on millions of transactions that are based on individual patterns, fleet management and routing taking into account current conditions…” This requires a Hadoop platform that can go beyond batch and support streaming writes, so data can be constantly written to the system while analysis is being conducted; high performance to meet business needs and real-time operations; and the ability to perform online database operations to react to the business situation and impact the business as it happens, not report on it one week, month, or quarter later. To do this requires THE RIGHT ARCHITECTURE.
  9. 96% of US internet traffic. Formerly used 2 other distros. Went to MapR to meet very high SLAs and performance.
  10. Push messaging. Starbucks or ESPN applications, and others. MapR is the only software that they pay for. Have HBase committers on staff. Took 8 application clusters and moved them into 1 MapR cluster; have 1 cluster with 8 sub-clusters running on different sets of nodes. Data placement control enables this. Went from 12 CDH servers and cut it down to 6, just for HBase tables. (They won’t use M7 since they are HBase committers.)
  11. Verizon Teradata example. Less than 10% of CDRs analyzed.
  12. More relevant local example: Experian
  13. Solutionary is a Managed Security Services provider with services that include network intrusion detection
  14. ----- Meeting Notes (3/27/14 11:12) ----- Zions Bank Video - Phishing Attack
  15. http://www.datanami.com/datanami/2014-02-21/a_peek_inside_cisco_s_hadoop_security_machine.html
  16. 20 TB per day; 60 nodes, 1000 cores
  17. MapR is the Hadoop technology leader, with over 500 paying customers and the largest production deployments in the world. People like to think of Yahoo, Facebook, and LinkedIn as big Hadoop users, and they are, but you would expect this because of their deep engineering heritage. Mainstream organizations that want to leverage Hadoop without hiring armies of engineers turn to MapR. We have the largest retailer, the largest financial services deployment, and the largest media, healthcare, and government agencies. Through a combination of Apache Hadoop community participation and a differentiated data platform, MapR lets organizations do more with Hadoop in both operational and analytical use cases that are expensive or impossible with other Hadoop distributions.
  18. Again: (1) Hadoop is the leading technology for a Big Data platform, with the power to transform the customer’s business. (2) MapR gives you the most technologically advanced distribution for Hadoop. (3) MapR has the product, services, and partner network to ensure production success and continued success.