Cloudera: Hadoop for the Enterprise
September 2008
Data Growing Much Faster than Moore’s Law
04/21/17
Cloudera Confidential
Source: Richard Winter, Why Are Data Warehouses Growing so Fast?, April 2008
Uniprocessor Performance
Founding Team
• Mike Olson, CEO
– CEO Sleepycat
– Britton Lee, Illustra, Informix, Oracle
– BA, MS CS, Berkeley
• Amr Awadallah, CTO, VP Engineering
– Founder Aptivia/VivaSmart
– 8 years at Yahoo! running BI infrastructure, including Hadoop
– PhD EE, Stanford
• Christophe Bisciglia, VP Technology
– Created Google/NSF Hadoop cluster and program
– BA CS, U Washington
• Jeff Hammerbacher, VP Product
– Ran world’s largest operational BI support system on Hadoop, at Facebook
– BA Mathematics, Harvard
What Is Hadoop?
• Core engine:
– Open source implementation of Google’s MapReduce and GFS
– Hundreds or thousands of servers parallelize a data analysis task
• Interfaces built on top of MapReduce
• Storage layer beneath (HDFS)
• Doug Cutting, Mike Cafarella are advisors
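As a rough illustration of the MapReduce model named on this slide, the sketch below counts words with a map step and a reduce step. It is a single-process stand-in written for this note, not Hadoop's API: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical, and a real cluster would run the map and reduce calls on many servers in parallel.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit (word, 1) for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Hypothetical single-process driver: map, group by key, then reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # stands in for the shuffle phase
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


# Made-up input lines, for illustration only.
lines = ["hadoop stores data", "hadoop processes data in parallel"]
word_counts = run_mapreduce(lines, map_words, reduce_counts)
print(word_counts)
```

Because each `map_words` call looks at one line only and each `reduce_counts` call looks at one key only, the work can be split across machines without changing the result.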
Hadoop is Open Source
• Hadoop is distributed under the Apache License:
– Reduces concern about lock-in
– Low-cost, effective distribution strategy
– Allows innovation by partners, customers
– Third-party inspection of source code provides assurances on security, product quality
• Business-friendly license encourages commercial development
– “Open core” licensing
– Closed-source components, applications
Hadoop Users
Momentum: Google Trends
Netezza: $127M in FY08, $79M in FY07
Teradata: $830M in 1H08, $1.7B in FY07
Worldwide Phenomenon
Source: Google Insights world map for searches on “hadoop”, Sept 2008.
Why is Hadoop Successful?
• Brings computation closer to data, allowing both IO and compute scalability.
• Map-Reduce forces developers to think in a parallel way
• Operates on unstructured data, and structured data (HBase, Hive)
• Prescriptive development, grows with you without needing to re-architect
• Procedural language offers power
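The first bullet, bringing computation closer to data, can be sketched as follows. This is a toy model under stated assumptions: the node names and log records are invented, and "shipping" is only counted, not performed. It compares moving every raw record to one machine against counting on each node and moving only the small per-node summaries.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical cluster: each node stores its own block of log records.
blocks: Dict[str, List[str]] = {
    "node-a": ["GET /home"] * 400 + ["GET /search"] * 100,
    "node-b": ["GET /home"] * 250 + ["POST /login"] * 50,
    "node-c": ["GET /search"] * 200,
}


def count_locally(records: List[str]) -> Counter:
    """Runs where the block is stored; only the summary leaves the node."""
    return Counter(records)


# Approach 1: ship every raw record to a central machine, then count there.
records_shipped_central = sum(len(records) for records in blocks.values())
central_result = Counter(r for records in blocks.values() for r in records)

# Approach 2: count next to the data, then merge the per-node summaries.
local_summaries = [count_locally(records) for records in blocks.values()]
entries_shipped_local = sum(len(summary) for summary in local_summaries)
merged_result = sum(local_summaries, Counter())

print("raw records moved:", records_shipped_central)
print("summary entries moved:", entries_shipped_local)
print("same answer:", merged_result == central_result)
```

Both approaches give the same counts, but the second moves a handful of summary entries instead of every record, which is why adding nodes can scale IO and compute together.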
Current Systems Isolate Users from the Event Level Raw Data
Diagram labels: File Server Farm for Warehouse (non-queryable); Warehouse Pre-Processing; Instrumentation; Log Collection; Datamart Database; BI Reporting; MySQL; MemCached; Live Web Site; Data Mining (R, Weka, SAS, SPSS); ETL (shown six times)
Callouts: Non-Consumption; Expensive ETL Grids
Solution: “Smart” Storage Service
Diagram labels: Smart Storage: Grid For File Storage & Data Processing; Warehouse Pre-Processing; Instrumentation; Log Collection; Datamart Database; BI Reporting; MySQL; MemCached; Live Web Site; Data Mining (R, Weka, SAS, SPSS)
Callouts: Enable Consumption; Eliminate Expensive ETL Grids
BDP versus OLAP/OLTP
Chart axes: Schema Complexity; Processing Freedom; Table Join Complexity; Concurrent Jobs; Responsiveness; Per Job Data Volume; Data Update Pattern; Total Data Volume
Axis value labels: Structured, Unstructured; SQL, Generic Data Processing; Batch, Interactive; Append Only, Read/Write; 100 Tables; 1000; 100TB, 1PB, 10PB, 100PB
Series: OLAP/OLTP; Batch Data Processing
Source: Merrill Lynch Industry Overview, May 7, 2008
Cloudera Differentiators
• Enabling Hadoop as an elastic platform with statistical multiplexing over many customers
• Multi-Tenant Support: Concurrency, Priority, Namespace Isolation, Performance Isolation.
• Monitoring, Reliability, and Availability
• Resilience and Fast Recovery: a non-sexy problem that is critical to enterprises; there is no time to restart an ETL job from scratch without missing the SLA.
• IDE to easily debug, deploy, and tune.
• Integration with data mining and analysis functionality (R, Weka, SAS, SPSS)
• Connector certification: another non-sexy problem that is ignored by the community; makes sure the system is compatible with other enterprise systems.
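The statistical multiplexing claim in the first bullet can be made concrete with a small worked example. The demand figures below are invented purely for illustration: the point is only that when customers peak at different times, one shared cluster sized for the combined peak needs less capacity than separate clusters each sized for its own peak.

```python
from typing import Dict, List

# Hypothetical demand (servers needed) per customer over six time slots.
demand: Dict[str, List[int]] = {
    "customer-a": [10, 80, 20, 10, 10, 10],
    "customer-b": [10, 10, 70, 20, 10, 10],
    "customer-c": [20, 10, 10, 10, 90, 10],
}

# Dedicated clusters: each customer provisions for its own peak.
dedicated_capacity = sum(max(slots) for slots in demand.values())

# Shared cluster: provision once for the peak of the combined demand.
combined = [sum(slot) for slot in zip(*demand.values())]
shared_capacity = max(combined)

savings = 1 - shared_capacity / dedicated_capacity
print("dedicated:", dedicated_capacity)
print("shared:", shared_capacity)
print("capacity saved: {:.0%}".format(savings))
```

The saving shrinks as customers' peaks line up in time, so the benefit depends on how uncorrelated the workloads actually are.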

Cloudera's Original Pitch Deck from 2008

  • 1. Cloudera:Cloudera: Hadoop for the EnterpriseHadoop for the Enterprise September 2008September 2008
  • 2. Data Growing Much Faster thanData Growing Much Faster than Moore’s LawMoore’s Law 04/21/17 Cloudera ConfidentialCloudera Confidential 22 Source: Richard Winter, Why Are Data Warehouses Growing so Fast?, April 2008
  • 4. Founding TeamFounding Team • Mike Olson, CEOMike Olson, CEO – CEO SleepycatCEO Sleepycat – Britton Lee, Illustra,Britton Lee, Illustra, Informix, OracleInformix, Oracle – BA, MS CS, BerkeleyBA, MS CS, Berkeley • Amr Awadallah, CTO, VPAmr Awadallah, CTO, VP EngineeringEngineering – Founder Aptivia/VivaSmartFounder Aptivia/VivaSmart – 8 years at Yahoo! running8 years at Yahoo! running BI infrastructure, includingBI infrastructure, including HadoopHadoop – PhD EE, StanfordPhD EE, Stanford • Christophe Bisciglia, VPChristophe Bisciglia, VP TechnologyTechnology – Created Google/NSFCreated Google/NSF Hadoop cluster andHadoop cluster and programprogram – BA CS, U WashingtonBA CS, U Washington • Jeff Hammerbacher, VPJeff Hammerbacher, VP ProductProduct – Ran world’s largestRan world’s largest operational BI supportoperational BI support system on Hadoop, atsystem on Hadoop, at FacebookFacebook – BA Mathematics, HarvardBA Mathematics, Harvard 04/21/17 44Cloudera ConfidentialCloudera Confidential
  • 5. What Is Hadoop?
      • Core engine:
          – Open source implementation of Google’s MapReduce and GFS
          – Hundreds or thousands of servers parallelize a data analysis task
      • Interfaces built on top of MapReduce
      • Storage layer beneath (HDFS)
      • Doug Cutting, Mike Cafarella are advisors
  • 6. Hadoop is Open Source
      • Hadoop is distributed under the Apache License:
          – Reduces concern about lock-in
          – Low-cost, effective distribution strategy
          – Allows innovation by partners, customers
          – Third-party inspection of source code provides assurances on security, product quality
      • Business-friendly license encourages commercial development
          – “Open core” licensing
          – Closed-source components, applications
  • 7. Hadoop Users
  • 8. Momentum: Google Trends. Netezza: $127M in FY08, $79M in FY07. Teradata: $830M in 1H08, $1.7B in FY07.
  • 9. Worldwide Phenomenon. Source: Google Insights world map for searches on “hadoop”, Sept 2008.
  • 10. Why is Hadoop Successful?
      • Brings computation closer to data, allowing both IO and compute scalability.
      • Map-Reduce forces developers to think in a parallel way
      • Operates on unstructured data, and structured data (HBASE, HIVE)
      • Prescriptive development, grows with you without needing to re-architect
      • Procedural language offers power
  • 11. Current Systems Isolate Users from the Event Level Raw Data. [Diagram. Components: Instrumentation; Log Collection; File Server Farm for Warehouse (non-queryable); Warehouse Pre-Processing; Datamart Database; BI Reporting; MySQL; MemCached; Live Web Site; Data Mining (R, Weka, SAS, SPSS). Components are linked by ETL steps. Callouts: Non-Consumption; Expensive ETL Grids.]
  • 12. Solution: “Smart” Storage Service. [Diagram. Components: Instrumentation; Log Collection; Smart Storage: Grid For File Storage & Data Processing; Warehouse Pre-Processing; Datamart Database; BI Reporting; MySQL; MemCached; Live Web Site; Data Mining (R, Weka, SAS, SPSS). Callouts: Enable Consumption; Eliminate Expensive ETL Grids.]
  • 13. BDP versus OLAP/OLTP. [Chart comparing OLAP/OLTP with Batch Data Processing. Axes: Schema Complexity (Structured, Unstructured); Processing Freedom (SQL, Generic Data Processing); Table Join Complexity; Concurrent Jobs; Responsiveness (Batch, Interactive); Per Job Data Volume; Data Update Pattern (Append Only, Read/Write); Total Data Volume. Scale marks on the chart: 100, 1000, 100 Tables, 100TB, 1PB, 10PB, 100PB.]
  • 14. Source: Merrill Lynch Industry Overview, May 7, 2008
  • 15. Cloudera Differentiators
      • Enabling Hadoop as an elastic platform with statistical multiplexing over many customers
      • Multi-Tenant Support: Concurrency, Priority, Namespace Isolation, Performance Isolation.
      • Monitoring, Reliability, and Availability
      • Resilience and Fast Recovery: A non-sexy problem that is critical to enterprises, no time to restart ETL job from scratch, otherwise misses SLA.
      • IDE to easily debug, deploy, and tune.
      • Integration with data mining and analysis functionality (R, Weka, SAS, SPSS)
      • Connector certification: another non-sexy problem that is ignored by community, make sure system is compatible with other enterprise systems.
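
Slide 5 describes Hadoop's core engine as an implementation of MapReduce. As a rough illustration of that programming model only, the sketch below counts words in a single Python process. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example and are not part of any Hadoop API; a real cluster would spread the map and reduce calls across many servers.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, the step a framework runs between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

Because each `map_phase` call sees only one line and each `reduce_phase` call sees only one key, both phases can in principle run in parallel, which is the property slide 10 refers to when it says Map-Reduce forces developers to think in a parallel way.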

Editor's notes

  1. Moore’s law is failing; the only way to speed up going forward is massive parallelism on grids and multicore processors.
  2. Furthermore, these expensive ETL grids are only needed for a couple of hours in the morning to meet the loading SLA.
  3. Another pain point is resilience to failure: currently, when a Hadoop job fails, you have to restart it all the way from the beginning. The community is not spending much time addressing this problem since it is not "sexy", but it is critical for enterprises with strict SLAs to meet. You don't want to have to restart your ETL job from scratch when a failure occurs; there is no time for that. There is a need to snapshot jobs at intermediate checkpoints so that you don't have to restart all the way from the beginning in case of failure.
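
The checkpointing idea in note 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Hadoop or Cloudera implemented recovery: the function `run_with_checkpoints`, the stage list, and the JSON checkpoint file are all hypothetical, and the intermediate data is assumed to be JSON-serializable.

```python
import json
import os
from typing import Any, Callable, List, Tuple

# A stage is a (name, function) pair; both are illustrative, not a Hadoop API.
Stage = Tuple[str, Callable[[Any], Any]]


def run_with_checkpoints(stages: List[Stage], initial: Any, path: str) -> Any:
    """Run stages in order, saving state after each so a rerun can resume."""
    state = {"done": 0, "last_stage": None, "data": initial}
    if os.path.exists(path):
        # A previous run left a snapshot: pick up after its last finished stage.
        with open(path) as fh:
            state = json.load(fh)

    for index in range(state["done"], len(stages)):
        name, func = stages[index]
        state["data"] = func(state["data"])  # may raise; earlier snapshot survives
        state["done"] = index + 1
        state["last_stage"] = name
        tmp_path = path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump(state, fh)
        # Rename over the old snapshot so a crash never leaves a partial file.
        os.replace(tmp_path, path)
    return state["data"]
```

If a stage raises, calling `run_with_checkpoints` again with the same `path` skips every stage that already completed and retries only from the failed one, which is the behavior the note asks for when an ETL job cannot afford a restart from scratch.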