Democratizing Big Data
Stefan Groschupf
Co-Founder & CTO




    © Datameer, Inc. 2010
Street Cred
    Long-time open source contributor




                             Zkclient
                             Aws-tasks

Who are we?

    Big data analytics leveraging the power and scale of Hadoop
    Started working on idea in 2008, formed company in 2009
    Headquartered in San Mateo, CA with office in Halle, Germany
    Funded by


    Management team from Yahoo!, Sun, Apple, Borland




Data grows rapidly


    [Chart: unstructured vs. structured data growth]



        Enterprise data doubles every three years (Forrester)
        Unstructured data grows at 61.7% CAGR (IDC)
        Structured data grows at 21.8% (IDC)
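
To compare the three figures above on one scale, the sketch below converts between a doubling time and a compound annual growth rate. It is not part of the talk. The function names are illustrative, and it assumes all three figures describe steady compound growth.

```python
import math

def cagr_from_doubling_time(years: float) -> float:
    """Annual growth rate implied by doubling every `years` years."""
    return 2 ** (1 / years) - 1

def doubling_time_from_cagr(rate: float) -> float:
    """Years needed to double at a compound annual growth rate `rate`."""
    return math.log(2) / math.log(1 + rate)

# Figures quoted on the slide.
enterprise_cagr = cagr_from_doubling_time(3)            # doubles every three years
unstructured_doubling = doubling_time_from_cagr(0.617)  # 61.7% CAGR
structured_doubling = doubling_time_from_cagr(0.218)    # 21.8%

print(f"Enterprise data: {enterprise_cagr:.1%} per year")
print(f"Unstructured data doubles in {unstructured_doubling:.2f} years")
print(f"Structured data doubles in {structured_doubling:.2f} years")
```

Under these assumptions, doubling every three years works out to roughly 26% per year, while 61.7% and 21.8% annual growth correspond to doubling in about 1.4 and 3.5 years.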


Big Data Analytics Stack




    Infrastructure              Platform   Data   Analytics


Big Data for Anyone

    EMR, S3: < $100 to process a TB
    Tools getting easier to use
    • Cascading / Pig vs MapReduce
    • Spreadsheets vs SQL




Discover Influencers for < $100



    [Diagram: on an EC2 server, a TwitterClient thread (Basic Auth) feeds a Compression thread, which feeds an Upload thread that writes to S3. Size labels: 256 MB and 50 MB.]
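
The three-thread layout on this slide can be sketched as a producer/consumer pipeline joined by queues. This is a minimal illustration, not the code used in the talk. The Twitter stream and S3 are replaced by in-memory stand-ins, every function name is hypothetical, and `batch_bytes` is an assumed parameter because the slide does not say what its 256 MB and 50 MB labels refer to.

```python
import gzip
import queue
import threading

STOP = object()  # sentinel marking the end of the stream

def client(lines, raw_q):
    """Stand-in for the TwitterClient thread: emits raw text records."""
    for line in lines:
        raw_q.put(line.encode("utf-8"))
    raw_q.put(STOP)

def compressor(raw_q, packed_q, batch_bytes):
    """Collects records and gzips each batch once it reaches batch_bytes."""
    batch, size = [], 0
    while True:
        item = raw_q.get()
        if item is STOP:
            break
        batch.append(item)
        size += len(item)
        if size >= batch_bytes:
            packed_q.put(gzip.compress(b"\n".join(batch)))
            batch, size = [], 0
    if batch:  # flush the final partial batch
        packed_q.put(gzip.compress(b"\n".join(batch)))
    packed_q.put(STOP)

def uploader(packed_q, store):
    """Stand-in for the Upload thread: appends to a list instead of S3."""
    while True:
        blob = packed_q.get()
        if blob is STOP:
            break
        store.append(blob)

def run_pipeline(lines, batch_bytes=1024):
    """Runs the three stages concurrently and returns the 'uploaded' blobs."""
    raw_q = queue.Queue(maxsize=100)   # bounded, so a slow stage applies backpressure
    packed_q = queue.Queue(maxsize=10)
    store = []
    threads = [
        threading.Thread(target=client, args=(lines, raw_q)),
        threading.Thread(target=compressor, args=(raw_q, packed_q, batch_bytes)),
        threading.Thread(target=uploader, args=(packed_q, store)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store

if __name__ == "__main__":
    sample = [f"tweet {i} #example" for i in range(1000)]
    blobs = run_pipeline(sample)
    restored = b"\n".join(gzip.decompress(b) for b in blobs).decode("utf-8")
    print(len(blobs), "compressed batches;", len(restored.splitlines()), "records restored")
```

Keeping each stage on its own thread lets network reads, compression, and uploads overlap, and the bounded queues stop a fast stage from outrunning a slow one.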




#JustinBieber




#Teaparty





Stefan Groschupf of Datameer Gives Lightning Talk at BigDataCamp