Using Hadoop to Change Company Culture
     Amy O’Connor
     Senior Director, Analytics



1
The Amazing Everyday




2
The Amazing Everyday




3
NOKIA’S HISTORY: 1865 TO NOW




4
Data is our newest raw material


5
TRAFFIC
US Government data suggests worsening conditions in urban areas: this year, average commute time has risen by 9 minutes.

     Federal Highway Administration, Urban Congestion Report, January 2011 to March 2011

 6
TRAFFIC
Global car sales are growing: 50% more cars will be sold in 10 years than are sold today.

     IHS Automotive, October 2011




7
More data usually beats better algorithms
     Anand Rajaraman




8
Devices in use around the world
Probe points collected monthly from Nokia alone

•   Image Sensors
•   Accelerometers
•   Gyroscopes
•   Compasses
•   Pressure Sensors
•   Microphones
•   Light Sensors
•   Assisted GPS




9
18M         24


           80M
10
Data Storage & Analysis Landscape




11
Data Silos

•   Traffic Probes
•   Search Logs
•   Consumer Profile
•   Ad Data
•   Places Registry
•   Device Data




12
Smart Data

     Combining sets of behavioral & contextual data
13
A Good Way to Change Corporate Culture




14
Getting Children to Eat Peas…
     • Tell them you expect them to eat their peas.
     • Reward them with ice cream if they do.
     • Explain why it’s good for them to eat their peas.
     • Eat your own peas as a good role model.

     Leann Lipps Birch, Head of Human Development & Family Studies, Pennsylvania State University
15
Getting Children to Eat Peas…
     • Put them with children who love peas.
     • Change the stories they tell.

     Leann Lipps Birch, Head of Human Development & Family Studies, Pennsylvania State University
16
Nokia Data Asset

     Identity: Consumer Profile; SSO; Device Activation; Campaigns; Contacts
     Media: Products, Transactions; Songs, Delivery
     Care & Marketing: Device/User CRM; Activation Info; Net Promoter; PC Suite
     Advertising: Ad Inventory; Campaign Promotions; Ad Canvas
     Location: Premium Content; Favorite Routes; Log Files; Map Tiles; Imagery
     Navteq: All Probes Data; 3D Imagery; Street View; Feature Recognition
     Device Programs: NAC, IIA; Windows Phone; Panel data; Equipment Master; Factory Data
     Social: Social UGC; Universal Share; Journeys Data; Event Info
     Search: Log Files; Points of Interest
     Nokia IT: Registrations; Device Updates




17
     Users                     Skills                                   Analytics
     Decision-Makers           Domain Expertise                         Dashboards, Offline Analysis, Predictive Analytics
     Data Analysts             Domain Expertise, Statistical Skills     Teradata, Key Value Store
     Data Scientists           Domain Expertise, Statistical Skills,    Oozie, MapReduce, Pig, Hive
                               Computer Science
     Developers/Applications   Domain Expertise, Computer Science       Flume, Scribe, FTP, HDFS, HBase




18
Collaborative Working Model

     Stages, top to bottom: Present; Analyze and Aggregate; Load; Transform; Extract; Platform

     Analytics stack, top to bottom:
     • Analytics: Dashboards, Offline Analysis, Predictive Analytics
     • Teradata, Key Value Store
     • Oozie, MapReduce, Pig, Hive
     • Flume, Scribe, FTP; HDFS, HBase




19
Collaborative Working Model

     Present
     • To DataOS, Structured Data
     • Customer Dashboard
     • BI Tools: SPSS, Tableau, Cognos
     • Ad Hoc Reporting

     Analyze and Aggregate
     • Hive QL and Pig Queries
     • MR Analysis Job
     • Mahout Machine Learning
     • Rec. Engines

     Load
     • Metadata Catalog
     • User Metadata Interfaces
     • Create Hive Schema
     • MR Agg Job and Data to Oracle
     • MR Agg Job and Data to Teradata

     Transform
     • Create Library of MR Jobs
     • Monitor and Manage Transform
     • Data Model Definition
     • Develop Cleanse Job
     • Develop Validation Job
     • Develop Partition Job

     Extract
     • Catalog Data Sources
     • Define Standard ETL
     • Monitor and Manage Data Feed
     • Integrate Historical Data
     • Integrate Streaming Data
     • Define Custom ETL

     Platform
     • Co-located developer clusters, pre-production cluster, product cluster

     Legend: Data OS; Product Teams and/or DataOS

20
Smart Data: Behavioral & Contextual

     Aggregation
     • Update top searches table and geo activity table in Oracle
     • Merge multiple data sources
     • Implement app logic (e.g., round up latitude/longitude to 3 decimals)

     Standard ETL
     • Clean data, remove bad records, and run health checks
     • Partition data by date, hour, type
     • Archive raw data

     Sources: Ad Router, NAC, Local Search




21
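
The cleaning, rounding, and partitioning steps named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Nokia's actual job: the record fields (`type`, `ts`, `lat`, `lon`), the function names, and the validity checks are all assumptions, and rounding to the nearest value is assumed where the slide says "round up".

```python
from collections import defaultdict
from datetime import datetime, timezone


def clean(record):
    """Return a normalized record, or None if the record is bad.

    The field names are hypothetical; the slide does not show a schema.
    """
    try:
        lat = float(record["lat"])
        lon = float(record["lon"])
        ts = datetime.fromtimestamp(int(record["ts"]), tz=timezone.utc)
    except (KeyError, TypeError, ValueError, OverflowError, OSError):
        return None  # missing or malformed field: treat as a bad record
    # Basic health check: coordinates must be valid latitude/longitude.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return {
        "type": record.get("type", "unknown"),
        "ts": ts,
        # App logic from the slide: keep 3 decimals (nearest, by assumption).
        "lat": round(lat, 3),
        "lon": round(lon, 3),
    }


def partition_key(rec):
    """Partition by date, hour, and type, as listed on the slide."""
    return (rec["ts"].strftime("%Y-%m-%d"), rec["ts"].strftime("%H"), rec["type"])


def run_etl(raw_records):
    """Clean records and group them by partition; also count rejects."""
    partitions = defaultdict(list)
    rejected = 0
    for raw in raw_records:
        rec = clean(raw)
        if rec is None:
            rejected += 1
            continue
        partitions[partition_key(rec)].append(rec)
    return dict(partitions), rejected
```

In a Hadoop deployment each partition key would typically correspond to a directory or Hive partition; here the partitions are kept in an in-memory dictionary so the sketch runs on its own.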
Merchant Portal Heatmap




22
Mapping the World




     © 2011 Nokia Company Confidential
23
Mapping the World




     Probe data indicates the location, speed, heading, time, etc. of a mobile device.
     Billions of probe records per week.
     Covers almost the entire world.
24
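
A probe record as described on this slide (location, speed, heading, time) could be modeled roughly as follows. The field names, units, and the comma-separated input layout are assumptions made for illustration; the slides do not show the actual record format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProbePoint:
    device_id: str       # hypothetical anonymized device identifier
    latitude: float      # degrees
    longitude: float     # degrees
    speed_kph: float     # assumed unit
    heading_deg: float   # 0 = north, increasing clockwise (assumed convention)
    timestamp: datetime  # UTC


def parse_probe(line: str) -> ProbePoint:
    """Parse 'device_id,lat,lon,speed_kph,heading_deg,epoch_seconds'.

    The layout is a made-up example, not a Nokia format.
    """
    device_id, lat, lon, speed, heading, epoch = line.strip().split(",")
    return ProbePoint(
        device_id=device_id,
        latitude=float(lat),
        longitude=float(lon),
        speed_kph=float(speed),
        heading_deg=float(heading) % 360.0,  # normalize into [0, 360)
        timestamp=datetime.fromtimestamp(int(epoch), tz=timezone.utc),
    )
```

For example, `parse_probe("abc123,41.8781,-87.6298,42.5,370,1318000000")` yields a point whose heading is normalized to 10 degrees.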
Probe Density: Urban and Arterial




                                    100% AGR




25
792527719 (Jackson Blvd/Financial Pl)


      From Ref, One lane




26
27
24 Hours in Our Analytics Ecosystem

     • ~2 TB ingested
     • 350M messages via Scribe
     • >3,000 MR jobs
     • 10 TB processed


28
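
To give a feel for the daily totals on this slide, they can be converted into average rates. The slide's figures are approximate ("~" and ">"), so the results below are rough averages only; decimal terabytes (1 TB = 1,000,000 MB) are an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_DAY = 24 * 60

# Daily totals taken from the slide (approximate values).
messages_per_day = 350_000_000
mr_jobs_per_day = 3_000
ingested_tb_per_day = 2
processed_tb_per_day = 10

# Average rates, assuming load is spread evenly over the day.
messages_per_second = messages_per_day / SECONDS_PER_DAY
jobs_per_minute = mr_jobs_per_day / MINUTES_PER_DAY
ingest_mb_per_second = ingested_tb_per_day * 1_000_000 / SECONDS_PER_DAY
processed_to_ingested = processed_tb_per_day / ingested_tb_per_day

print(f"{messages_per_second:,.0f} messages/s")
print(f"{jobs_per_minute:.1f} MR jobs/min")
print(f"{ingest_mb_per_second:.1f} MB/s ingested")
print(f"{processed_to_ingested:.0f}x data processed per unit ingested")
```

On these figures the cluster averages roughly 4,000 messages per second and about two MapReduce jobs per minute, and it processes about five times as much data as it ingests each day.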
Spreading the Word


         Data Asset Catalog

     Smart Data Newsletter Stories

       Realtime Dashboards



29
The Amazing Everyday

          Thanks!



30

Hadoop World 2011: Changing Company Culture with Hadoop - Amy O'Connor, Nokia

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 

Último (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 

Hadoop World 2011: Changing Company Culture with Hadoop - Amy O'Connor, Nokia

  • 1. Using Hadoop to Change Company Culture. Amy O’Connor, Senior Director, Analytics
  • 5. Data is our newest raw material
  • 6. TRAFFIC: US Government data suggests worsening conditions in urban areas… this year average commute time has risen by 9 minutes. (Federal Highway Commission, Urban Congestion Report, January 2011 to March 2011)
  • 7. TRAFFIC: Global car sales are growing: 50% more cars will be sold in 10 years than are sold today. (IHS Automotive, October 2011)
  • 8. More data usually beats better algorithms. (Anand Rajaraman)
  • 9. Devices in use around the world. Probe points collected monthly from Nokia alone. Sensors: Image Sensors, Accelerometers, Gyroscopes, Compasses, Pressure Sensors, Microphones, Light Sensors, Assisted GPS
  • 10. 18M, 24, 80M
  • 11. Data Storage & Analysis Landscape
  • 12. Data Silos: Traffic Probes, Search Logs, Consumer Profile, Ad Data, Places Registry, Device Data
  • 13. Smart Data: Combining sets of behavioral & contextual data
  • 14. A Good Way to Change Corporate Culture
  • 15. Getting Children to Eat Peas… Tell them you expect them to eat their peas. Reward them with ice cream if they do. Explain why it’s good for them to eat their peas. Eat your own peas as a good role model. (Leann Lipps Birch, Head of Human Development & Family Studies, Pennsylvania State University)
  • 16. Getting Children to Eat Peas… Put them with children who love peas. Change the stories they tell. (Leann Lipps Birch, Head of Human Development & Family Studies, Pennsylvania State University)
  • 17. [Diagram; labels as extracted] Identity Media Care & Marketing Consumer Profile Products, Transactions Device/User CRM SSO Songs, Delivery Activation Info Device Activation Net Promoter Advertising Campaigns PC Suite Ad Inventory Contacts Campaign Promotions Location Navteq Ad Canvas Premium Content All Probes Data Favorite Routes 3D Imagery Nokia Log Files Street View Data Map Tiles Feature Recognition Asset Imagery Device Programs Social Search NAC, IIA Social UGC Log Files Windows Phone Universal Share Points of Interest Panel data Journeys Data Nokia IT Equipment Master Event Info Registrations Factory Data Device Updates
  • 18. Users: Decision-Makers (Domain Expertise); Data Analysts (Domain Expertise, Statistical Skills); Data Scientists (Domain Expertise, Statistical Skills, Computer Science); Developers/Applications (Domain Expertise, Computer Science). Platform diagram: Analytics (Dashboards, Offline Analysis, Predictive Analytics); Key Value Store, TeraData; Oozie, MapReduce, Pig, Hive; Flume, Scribe, FTP; HBase; HDFS
  • 19. Collaborative Working Model. Stages: Present, Analyze and Aggregate, Load, Transform, Extract. Platform diagram: Analytics (Dashboards, Offline Analysis, Predictive Analytics); Key Value Store, TeraData; Oozie, MapReduce, Pig, Hive; Flume, Scribe, FTP; HBase; HDFS
  • 20. Collaborative Working Model [diagram; labels as extracted] To BI Tools: Present DataOS, Customer Analytics SPSS, Ad Hoc Structured Dashboard Tableau, Reporting Data Cognos Offline Predictive Analyze Hive QL Dashboards Mahout MR Analysis Analytics Rec, and and Pig Analysis Machine Engines Aggregate Queries Job Learning Key Value TeraData MR Agg Store Agg MR User Create Load Metadata Job and Job and Metadata Hive Catalog Data to Data to Interfaces Schema Oracle Teradata Oozie MapReduce Pig Hive Monitor Create Data Develop Develop Develop Transform Library of and Model Cleanse Validation Partition MR Jobs Manage Definition Job Job Job Transform Flume HBase Catalog Define Monitor Integrate Integrate Define Extract Data Standard and Manage Historical HDFS Streaming Custom Sources ETL Scribe Data Data ETL HBase Data Feed FTP HBase. Platform: Co-located developer clusters, pre-production cluster, product cluster. Data OS Product Teams and/or DataOS
  • 21. Smart Data: Behavioral & Contextual Analytics. Aggregation: update top searches table and geo activity table in Oracle; merge multiple data sources; implement app logic (e.g., round up latitude/longitude to 3 decimals). Standard ETL: clean data, remove bad records, and health checks; partition data by date, hour, type; archive raw data. Sources: Ad Router, NAC, Local Search. Platform diagram: Dashboards, Offline Analysis, Predictive Analytics; Key Value Store, TeraData; Oozie, MapReduce, Pig, Hive; Flume, Scribe, FTP; HBase; HDFS
  • 23. Mapping the World. © 2011 Nokia, Company Confidential
  • 24. Mapping the World: Probe data indicates the location, speed, heading, time, etc. of a mobile device. Billions of probe records per week. Covers almost the entire world.
  • 25. Probe Density: Urban and Arterial 100% AGR
  • 26. 792527719 (Jackson Blvd/Financial Pl) From Ref, One lane
  • 28. 24 Hours in Our Analytics Ecosystem: ~2TB ingested; 350M messages via Scribe; >3000 MR jobs; 10TB processed
  • 29. Spreading the Word: Data Asset Catalog, Smart Data Newsletter, Stories, Realtime Dashboards
  • 30. The Amazing Everyday. Thanks!
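
The ETL steps named on slide 21 (clean the data and remove bad records, apply app logic such as limiting latitude/longitude to 3 decimals, and partition by date, hour, and type) could be sketched roughly as below. This is an illustration only, not Nokia's pipeline: the slides describe Pig, Hive, and MapReduce jobs, while this uses plain Python. The record fields (`type`, `lat`, `lon`, `ts`) and all function names are hypothetical, and ordinary rounding stands in for the slide's "round up".

```python
from collections import defaultdict
from datetime import datetime, timezone


def clean(records):
    """Drop records that fail basic sanity checks (the 'remove bad records' step)."""
    good = []
    for rec in records:
        try:
            lat = float(rec["lat"])
            lon = float(rec["lon"])
            ts = int(rec["ts"])  # assumed: Unix timestamp in seconds
        except (KeyError, TypeError, ValueError):
            continue  # missing or unparseable field
        if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
            continue  # coordinates outside the valid range
        good.append({"type": rec.get("type", "unknown"),
                     "lat": lat, "lon": lon, "ts": ts})
    return good


def partition_key(rec):
    """Build a (date, hour, type) key, mirroring 'partition data by date, hour, type'."""
    t = datetime.fromtimestamp(rec["ts"], tz=timezone.utc)
    return (t.strftime("%Y-%m-%d"), t.hour, rec["type"])


def run_etl(records):
    """Clean, round coordinates to 3 decimals, and group records by partition key."""
    partitions = defaultdict(list)
    for rec in clean(records):
        # Assumption: ordinary rounding; the slide's "round up" may mean something else.
        rec["lat"] = round(rec["lat"], 3)
        rec["lon"] = round(rec["lon"], 3)
        partitions[partition_key(rec)].append(rec)
    return dict(partitions)


if __name__ == "__main__":
    sample = [
        {"type": "search", "lat": "41.87811", "lon": "-87.62980", "ts": 1320753600},
        {"type": "search", "lat": "999", "lon": "0", "ts": 1320753600},  # bad latitude
        {"lat": "not-a-number"},                                          # malformed
    ]
    for key, rows in run_etl(sample).items():
        print(key, rows)
```

Keeping the partition key as a simple tuple makes it easy to map onto a directory layout or a partitioned table later, which is one common way such output is stored.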