Big Data and Society

Dirk deRoos
IBM World Wide Technical Sales, IBM Big Data Platform
dderoos@ca.ibm.com
   @Dirk_deRoos




                                                        © 2012 IBM Corporation
Agenda



    • What is Big Data?

    • What will our ability to analyze Big Data lead to?

    • What should we do to prepare?




What is Big Data?

Stories and Definitions




Harnessing the Largest Predictive Focus Group in the World


    Purpose
     • Understand public sentiment towards an event:
       movie trailers
     • Deeply understand the potential customer profile:
       gender, occupation, intent to watch
     • Alter marketing launch plans based on insight



    Background
     • 1.1 billion tweets analyzed
     • 5.7 million blog and forum posts
     • 3.5 million messages
     • Also: Facebook, Google+, Tumblr, Flickr




Media & Entertainment Social Media Analytics




Asian telco reduces billing costs and improves customer satisfaction

     • Real-time mediation and analysis of 6B CDRs (call detail records) per day
     • Data processing time reduced from 12 hrs to 1 sec
     • Hardware cost reduced to 1/8th
     • Proactively address issues (e.g. dropped calls) impacting customer satisfaction

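The telco figures are easier to picture as per-second rates. The sketch below is only back-of-the-envelope arithmetic on the two numbers quoted on the slide (6B CDRs per day, 12 hrs to 1 sec). The function and variable names are illustrative assumptions, and the result is an average rate, not a measured peak load.

```python
# Back-of-the-envelope arithmetic for the telco slide's figures.
# Function and variable names are illustrative; they do not come from the deck.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def records_per_second(records_per_day: float) -> float:
    """Average sustained rate implied by a daily record count."""
    return records_per_day / SECONDS_PER_DAY


def speedup(before_seconds: float, after_seconds: float) -> float:
    """How many times shorter the new processing time is."""
    return before_seconds / after_seconds


cdrs_per_day = 6_000_000_000             # "6B CDRs per day"
average_rate = records_per_second(cdrs_per_day)
time_factor = speedup(12 * 60 * 60, 1)   # "12 hrs to 1 sec"

print(f"Average load: about {average_rate:,.0f} CDRs per second")
print(f"Processing time shortened by a factor of {time_factor:,.0f}")
```

Averaged over a day, 6 billion records is roughly 69,000 records per second, and going from 12 hours to 1 second shortens processing time by a factor of 43,200.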
Pacific Northwest Smart Grid Demonstration Project

Capabilities:

     • Stream Computing – real-time control system
     • Deep Analytics Appliance – analyze massive data sets

Demonstrates scalability from 100 to 500K homes while retaining 10 years’ historical data

Accommodates ad hoc analysis of price fluctuation, energy consumption profiles, risk, fraud detection, grid health, etc.

Watson’s advanced analytic capabilities can sort through the equivalent of 200 MILLION pages of data to uncover an answer in 3 SECONDS.

The Jeopardy! Challenge – Question Answering Solution

     • Broad/Open Domain
     • Complex Language
     • High Precision
     • Accurate Confidence
     • High Speed

     $200: If you're standing, it's the direction you should look to check out the wainscoting.
     $600: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus
     $1000: The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author.
     $2000: Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north

Brief History of IBM Watson

     • IBM Research Project (2006 – ): R&D
     • Jeopardy! Grand Challenge (Feb 2011): Demonstration
     • Watson for Healthcare (Aug 2011 – ): Commercialization
     • Watson for Financial Services (Mar 2012 – ): Expansion
     • Watson Industry Solutions (2012 – ): Cross-industry Applications

From inspiration and invention, through innovation and industrialization, ending with industry transformation.

IBM Watson brings together a set of transformational technologies to drive optimized outcomes

     1 Understands natural language and human speech
     2 Generates and evaluates hypotheses for better outcomes
     3 Adapts and learns from user selections and responses

…built on a massively parallel probabilistic evidence-based architecture optimized for POWER7

[Figure labels: 99%, 60%, 10%]

Healthcare industry is beset with some of the most complex information challenges we collectively face

     • Medical information is doubling every 5 years, much of which is unstructured
     • 81% of physicians report spending 5 hours or less per month reading medical journals

     “Medicine has become too complex (and only) about 20% of the knowledge clinicians use today is evidence-based.”
     Steven Shapiro, Chief Medical & Scientific Officer, UPMC

     Source: International Journal of Circumpolar Health, DoctorDirectory.com, Institute for Medicine

IBM and Seton Health put Ready for Watson to work

     • Proactively target care management that
       reduces re-admission of congestive heart
       failure patients
     • Improve patient quality of life, reduce cost
       and mortality rates
     • Analyze unstructured data (e.g., physician
       notes) and provide an integrated view of
       clinical and operational information
     • Analysis revealed:
          – 18 top indicators determined
          – 2 key re-admission factors




Traditional Approach
Structured, analytical, logical

     • Data Warehouse
     • Structured, Repeatable, Linear
     • Traditional Sources: Transaction Data, Internal App Data, Mainframe Data, ERP data

New Approach
Creative, holistic thought, intuition

     • Hadoop, Streams
     • Unstructured, Exploratory, Iterative
     • New Sources: Web Logs, Social Data, Text Data: emails, Sensor data: images, RFID

[Diagram: Enterprise Integration sits between the two approaches]

1TB = 1000GB = 108 DVD movies
1PB = 1000TB = 108,000 DVD movies

     • 12+ TBs of tweet data every day
     • ? TBs of data every day
     • 25+ TBs of log data every day
     • 30 billion RFID tags today (1.3B in 2005)
     • 4.6 billion camera phones world wide
     • 100s of millions of GPS enabled devices sold annually
     • 2+ billion people on the Web by end 2011
     • 76 million smart meters in 2009… 200M by 2014

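The terabyte and petabyte comparisons on this slide are plain unit arithmetic, and a short sketch can check that the two lines agree with each other. The constants below are the slide's own numbers. The implied size per movie is derived from them and is not a figure the deck states, and the variable names are illustrative assumptions.

```python
# Consistency check for the slide's storage comparisons.
# Variable names are illustrative; the three constants are the slide's figures.

GB_PER_TB = 1000       # "1TB = 1000GB"
TB_PER_PB = 1000       # "1PB = 1000TB"
MOVIES_PER_TB = 108    # "1TB = ... 108 DVD movies"

# Derived value (not stated on the slide): disc size implied by 108 movies per TB.
implied_gb_per_movie = GB_PER_TB / MOVIES_PER_TB

# Scaling from terabytes to petabytes multiplies the movie count by 1000.
movies_per_pb = MOVIES_PER_TB * TB_PER_PB

print(f"Implied size per movie: about {implied_gb_per_movie:.2f} GB")
print(f"Movies per petabyte: {movies_per_pb:,}")
```

The petabyte line follows directly from the terabyte line: 108 movies per terabyte times 1000 terabytes gives the 108,000 quoted on the slide.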
The Big Data Opportunity
Extracting insight from an immense volume, variety and velocity of data,
            in context, beyond what was previously possible.




What can you do with Big Data?

     Financial Services
     • Fraud detection
     • 360° View of the Customer

     Utilities
     • Weather analysis
     • Smart grid management

     Transportation
     • Logistics optimization
     • Traffic congestion

     IT
     • System Log Analysis
     • Cybersecurity

     Health & Life Sciences
     • Epidemic early warning
     • ICU monitoring

     Retail
     • 360° View of the Customer
     • Real-time promotions

     Telecommunications
     • Geomapping / marketing
     • Network monitoring

     Law Enforcement
     • Multimodal surveillance
     • Cyber security detection

Why Didn’t We Use All of the Big Data Before?




What will our ability to analyze Big Data lead to?

Changes in Media, Changes in Mentality

Rise of the Data Scientist Role in Organizations

     • What is a Data Scientist?
          •Statistics expert
          •Text analytics expert
          •Data integration expert

          •Someone who specializes in exploring data
          in new ways, looking for value


     • This will become a normal role in organizations




More Information will be Accessible to More People

     • Large organizations will not be able to control the world’s data
         •Not about WikiLeaks
         •Many information gathering sources


     • More People will be able to Perform Deep Analysis
         •Data Scientist skills and tools will increasingly be available at a lower cost
              • Cloud technology
              • Lower cost, higher capacity hardware




Analytics has Evolved from Business Initiative to Business Imperative

Analytically sophisticated companies outperform their competition

     • Respondents who say analytics creates a competitive advantage: 37% in 2010, 58% in 2011 (57% increase)
     • Organizations achieving a competitive advantage with analytics are 2.2x more likely to substantially outperform their industry peers

Source: The New Intelligent Enterprise, a joint MIT Sloan Management Review and IBM Institute of Business Value analytics research partnership. Copyright © Massachusetts Institute of Technology 2011

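The "57% increase" on the analytics slide is a relative change between the two survey figures (37% in 2010, 58% in 2011), not a difference in percentage points. A minimal sketch of that arithmetic follows. The helper name `percent_increase` is an illustrative assumption, and only the two survey percentages come from the slide.

```python
# Relative change versus percentage-point change for the survey figures.
# The helper name is illustrative; the two inputs are the slide's percentages.


def percent_increase(old: float, new: float) -> float:
    """Relative growth from old to new, expressed in percent."""
    return (new - old) / old * 100.0


share_2010 = 37.0   # % of respondents in 2010
share_2011 = 58.0   # % of respondents in 2011

relative_change = percent_increase(share_2010, share_2011)
point_change = share_2011 - share_2010

print(f"Relative increase: about {relative_change:.0f}%")
print(f"Percentage-point change: {point_change:.0f} points")
```

The share of respondents rose by 21 percentage points, which is about 57% more than the 2010 level.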
Accelerated Pace of Change




Volatility in Media Usage




Changes in Societal View of Privacy

     • Privacy is a social construction
           •Changed through history


     • Growing acceptance of individual and aggregate monitoring
           •69% of Americans store personal data in the cloud. (Pew Internet, 2008)
           •85% of Americans own a cell phone. (Pew Internet, 2011)




     Source: Pew Internet, 2008, 2011
What Should We Do?

Watch, Think, Legislate




Quest for Governance of Big Data


     • Businesses seek maximum capitalization of Big Data
          •Retail
               • Increased tracking of customer behavior

          •Telco/Internet
                • Individual profiling based on personal browsing or calling history
                • Rights restrictions for digital media – pay per use

     • Governments seek maximum well-being of society through Big Data
          •Security
               • Everything can be monitored

          •Health care
                • Centralize all records
                • Genomic data

     • Individuals seek protection from having their data used against them


Impact of Technological Change is Unpredictable




As Changes Occur, Society Must React



     • What kind of world do we want?
         •Opportunity for business profits
         •Greater good of society
         •Rights of individual / freedom


     • Constant forces
         •Markets will always strive for efficiency
         •Technological changes are always coming


     • Balance priorities




30   © 2012 IBM Corporation

Más contenido relacionado

La actualidad más candente

Data Center Decisions: Build Versus Buy
Data Center Decisions: Build Versus BuyData Center Decisions: Build Versus Buy
Data Center Decisions: Build Versus BuyVISIHOSTING
 
Big Data and Content Management. SkyDox and the European Court of Human Righ...
Big Data and Content Management.  SkyDox and the European Court of Human Righ...Big Data and Content Management.  SkyDox and the European Court of Human Righ...
Big Data and Content Management. SkyDox and the European Court of Human Righ...SkyDox LTD
 
Infrastructure software 2011 2012
Infrastructure software 2011 2012Infrastructure software 2011 2012
Infrastructure software 2011 2012MMMTechLaw
 
15h00 intel - intel big data for aws summits rev3
15h00   intel - intel big data for aws summits rev315h00   intel - intel big data for aws summits rev3
15h00 intel - intel big data for aws summits rev3infolive
 
Durnat law numbers
Durnat law numbersDurnat law numbers
Durnat law numbersactkm
 
Enterprise Energy Management using a Linked Dataspace for Energy Intelligence
Enterprise Energy Management using a Linked Dataspace for Energy IntelligenceEnterprise Energy Management using a Linked Dataspace for Energy Intelligence
Enterprise Energy Management using a Linked Dataspace for Energy IntelligenceEdward Curry
 
Challenges for Open Semantic Service Networks : models, theory, applications
Challenges for Open Semantic Service Networks: models, theory, applications Challenges for Open Semantic Service Networks: models, theory, applications
Challenges for Open Semantic Service Networks : models, theory, applications Jorge Cardoso
 
Rob anderson
Rob andersonRob anderson
Rob andersonEduserv
 
NEW: TrendConnect: Big Data Report September
NEW: TrendConnect: Big Data Report SeptemberNEW: TrendConnect: Big Data Report September
NEW: TrendConnect: Big Data Report SeptemberSiliconANGLE Pro
 
Open Semantic Service Networks
Open Semantic Service NetworksOpen Semantic Service Networks
Open Semantic Service NetworksJorge Cardoso
 
Progress with confidence into next generation IT
Progress with confidence into next generation ITProgress with confidence into next generation IT
Progress with confidence into next generation ITPaul Muller
 

La actualidad más candente (17)

Data Center Decisions: Build Versus Buy
Data Center Decisions: Build Versus BuyData Center Decisions: Build Versus Buy
Data Center Decisions: Build Versus Buy
 
Big Data and Content Management. SkyDox and the European Court of Human Righ...
Big Data and Content Management.  SkyDox and the European Court of Human Righ...Big Data and Content Management.  SkyDox and the European Court of Human Righ...
Big Data and Content Management. SkyDox and the European Court of Human Righ...
 
Next Generation IT
Next Generation ITNext Generation IT
Next Generation IT
 
101 ab 1415-1445
101 ab 1415-1445101 ab 1415-1445
101 ab 1415-1445
 
Infrastructure software 2011 2012
Infrastructure software 2011 2012Infrastructure software 2011 2012
Infrastructure software 2011 2012
 
15h00 intel - intel big data for aws summits rev3
15h00   intel - intel big data for aws summits rev315h00   intel - intel big data for aws summits rev3
15h00 intel - intel big data for aws summits rev3
 
Durnat law numbers
Durnat law numbersDurnat law numbers
Durnat law numbers
 
Intel and Big Data
Intel and Big DataIntel and Big Data
Intel and Big Data
 
Enterprise Energy Management using a Linked Dataspace for Energy Intelligence
Enterprise Energy Management using a Linked Dataspace for Energy IntelligenceEnterprise Energy Management using a Linked Dataspace for Energy Intelligence
Enterprise Energy Management using a Linked Dataspace for Energy Intelligence
 
Cloud provider transparency
Cloud provider transparencyCloud provider transparency
Cloud provider transparency
 
Challenges for Open Semantic Service Networks : models, theory, applications
Challenges for Open Semantic Service Networks: models, theory, applications Challenges for Open Semantic Service Networks: models, theory, applications
Challenges for Open Semantic Service Networks : models, theory, applications
 
Greenplum hadoop
Greenplum hadoopGreenplum hadoop
Greenplum hadoop
 
Rob anderson
Rob andersonRob anderson
Rob anderson
 
NEW: TrendConnect: Big Data Report September
NEW: TrendConnect: Big Data Report SeptemberNEW: TrendConnect: Big Data Report September
NEW: TrendConnect: Big Data Report September
 
Document Capture Technologies (OTCBB: DCMT)
Document Capture Technologies (OTCBB: DCMT) Document Capture Technologies (OTCBB: DCMT)
Document Capture Technologies (OTCBB: DCMT)
 
Open Semantic Service Networks
Open Semantic Service NetworksOpen Semantic Service Networks
Open Semantic Service Networks
 
Progress with confidence into next generation IT
Progress with confidence into next generation ITProgress with confidence into next generation IT
Progress with confidence into next generation IT
 

Destacado

AMIA Johan Oomen Final
AMIA Johan Oomen FinalAMIA Johan Oomen Final
AMIA Johan Oomen FinalJohan Oomen
 
Met kennisbeheer op weg naar service excellence - SEE 2016
Met kennisbeheer op weg naar service excellence - SEE 2016Met kennisbeheer op weg naar service excellence - SEE 2016
Met kennisbeheer op weg naar service excellence - SEE 2016TOPdesk
 
Inside Papyrus Webrepository - Technology Innovation Brochure by ISIS Papyrus...
Inside Papyrus Webrepository - Technology Innovation Brochure by ISIS Papyrus...Inside Papyrus Webrepository - Technology Innovation Brochure by ISIS Papyrus...
Inside Papyrus Webrepository - Technology Innovation Brochure by ISIS Papyrus...ISIS Papyrus Software
 
David Bianco - Enterprise Security Monitoring
David Bianco - Enterprise Security MonitoringDavid Bianco - Enterprise Security Monitoring
David Bianco - Enterprise Security Monitoringbsidesaugusta
 
TOPdesk, SEE what's new - SEE 2016
TOPdesk, SEE what's new - SEE 2016TOPdesk, SEE what's new - SEE 2016
TOPdesk, SEE what's new - SEE 2016TOPdesk
 
Het servicedesk HR binnen Philadelphia - SEE 2016
Het servicedesk HR binnen Philadelphia - SEE 2016Het servicedesk HR binnen Philadelphia - SEE 2016
Het servicedesk HR binnen Philadelphia - SEE 2016TOPdesk
 
EuropeanaTECH Conference ~ Distributed Community Empowerment
EuropeanaTECH Conference ~ Distributed Community EmpowermentEuropeanaTECH Conference ~ Distributed Community Empowerment
EuropeanaTECH Conference ~ Distributed Community EmpowermentJohan Oomen
 
SAMT 2009 Johan Oomen
SAMT 2009 Johan OomenSAMT 2009 Johan Oomen
SAMT 2009 Johan OomenJohan Oomen
 
Escape the Complexity - Technology Innovation Brochure by ISIS Papyrus Software
Escape the Complexity - Technology Innovation Brochure by ISIS Papyrus SoftwareEscape the Complexity - Technology Innovation Brochure by ISIS Papyrus Software
Escape the Complexity - Technology Innovation Brochure by ISIS Papyrus SoftwareISIS Papyrus Software
 
Spider man
Spider manSpider man
Spider man144103
 
Met kennisbeheer naar service excellence op de supportafdeling van TOPdesk - ...
Met kennisbeheer naar service excellence op de supportafdeling van TOPdesk - ...Met kennisbeheer naar service excellence op de supportafdeling van TOPdesk - ...
Met kennisbeheer naar service excellence op de supportafdeling van TOPdesk - ...TOPdesk
 
Addressing the cyber kill chain
Addressing the cyber kill chainAddressing the cyber kill chain
Addressing the cyber kill chainSymantec Brasil
 
Process.gov - Elements of Adaptive Case Management
Process.gov - Elements of Adaptive Case ManagementProcess.gov - Elements of Adaptive Case Management
Process.gov - Elements of Adaptive Case Managementmjpucher
 
Canterbury Tales Review
Canterbury Tales ReviewCanterbury Tales Review
Canterbury Tales ReviewFranklin Local
 
Infographic – Sales Growth: Five proven strategies
Infographic – Sales Growth: Five proven strategiesInfographic – Sales Growth: Five proven strategies
Infographic – Sales Growth: Five proven strategiesMcKinsey on Marketing & Sales
 

Destacado (18)

AMIA Johan Oomen Final
AMIA Johan Oomen FinalAMIA Johan Oomen Final
AMIA Johan Oomen Final
 
Met kennisbeheer op weg naar service excellence - SEE 2016
Met kennisbeheer op weg naar service excellence - SEE 2016Met kennisbeheer op weg naar service excellence - SEE 2016
Met kennisbeheer op weg naar service excellence - SEE 2016
 
Inside Papyrus Webrepository - Technology Innovation Brochure by ISIS Papyrus...
Inside Papyrus Webrepository - Technology Innovation Brochure by ISIS Papyrus...Inside Papyrus Webrepository - Technology Innovation Brochure by ISIS Papyrus...
Inside Papyrus Webrepository - Technology Innovation Brochure by ISIS Papyrus...
 
David Bianco - Enterprise Security Monitoring
David Bianco - Enterprise Security MonitoringDavid Bianco - Enterprise Security Monitoring
David Bianco - Enterprise Security Monitoring
 
Doc1
Doc1Doc1
Doc1
 
TOPdesk, SEE what's new - SEE 2016
TOPdesk, SEE what's new - SEE 2016TOPdesk, SEE what's new - SEE 2016
TOPdesk, SEE what's new - SEE 2016
 
Het servicedesk HR binnen Philadelphia - SEE 2016
Het servicedesk HR binnen Philadelphia - SEE 2016Het servicedesk HR binnen Philadelphia - SEE 2016
Het servicedesk HR binnen Philadelphia - SEE 2016
 
EuropeanaTECH Conference ~ Distributed Community Empowerment
EuropeanaTECH Conference ~ Distributed Community EmpowermentEuropeanaTECH Conference ~ Distributed Community Empowerment
EuropeanaTECH Conference ~ Distributed Community Empowerment
 
SAMT 2009 Johan Oomen
SAMT 2009 Johan OomenSAMT 2009 Johan Oomen
SAMT 2009 Johan Oomen
 
Escape the Complexity - Technology Innovation Brochure by ISIS Papyrus Software
Escape the Complexity - Technology Innovation Brochure by ISIS Papyrus SoftwareEscape the Complexity - Technology Innovation Brochure by ISIS Papyrus Software
Escape the Complexity - Technology Innovation Brochure by ISIS Papyrus Software
 
Protection Equipment in a Power Station
Protection Equipment in a Power StationProtection Equipment in a Power Station
Protection Equipment in a Power Station
 
Spider man
Spider manSpider man
Spider man
 
Met kennisbeheer naar service excellence op de supportafdeling van TOPdesk - ...
Met kennisbeheer naar service excellence op de supportafdeling van TOPdesk - ...Met kennisbeheer naar service excellence op de supportafdeling van TOPdesk - ...
Met kennisbeheer naar service excellence op de supportafdeling van TOPdesk - ...
 
งานบวชปากเซ
งานบวชปากเซงานบวชปากเซ
งานบวชปากเซ
 
Addressing the cyber kill chain
Addressing the cyber kill chainAddressing the cyber kill chain
Addressing the cyber kill chain
 
Process.gov - Elements of Adaptive Case Management
Process.gov - Elements of Adaptive Case ManagementProcess.gov - Elements of Adaptive Case Management
Process.gov - Elements of Adaptive Case Management
 
Canterbury Tales Review
Canterbury Tales ReviewCanterbury Tales Review
Canterbury Tales Review
 
Infographic – Sales Growth: Five proven strategies
Infographic – Sales Growth: Five proven strategiesInfographic – Sales Growth: Five proven strategies
Infographic – Sales Growth: Five proven strategies
 

Similar a Big Data and Society: Understanding the Opportunities and Challenges

Big Data Meets Social Analytics - IBM Connect 2012 (CN-CC13)
Big Data Meets Social Analytics - IBM Connect 2012 (CN-CC13)Big Data Meets Social Analytics - IBM Connect 2012 (CN-CC13)
Big Data Meets Social Analytics - IBM Connect 2012 (CN-CC13)Mark Heid
 
Intel Cloud summit: Big Data by Nick Knupffer
Intel Cloud summit: Big Data by Nick KnupfferIntel Cloud summit: Big Data by Nick Knupffer
Intel Cloud summit: Big Data by Nick KnupfferIntelAPAC
 
Robert LeBlanc - Why Big Data? Why Now?
Robert LeBlanc - Why Big Data? Why Now?Robert LeBlanc - Why Big Data? Why Now?
Robert LeBlanc - Why Big Data? Why Now?Mauricio Godoy
 
What is big data - Architectures and Practical Use Cases
What is big data - Architectures and Practical Use CasesWhat is big data - Architectures and Practical Use Cases
What is big data - Architectures and Practical Use CasesTony Pearson
 
Cbs essay comp smarter planet
Cbs essay comp smarter planetCbs essay comp smarter planet
Cbs essay comp smarter planetAnders Quitzau
 
Intel Cloud Summit: Big Data
Intel Cloud Summit: Big DataIntel Cloud Summit: Big Data
Intel Cloud Summit: Big DataIntelAPAC
 
Ibm big data ibm marriage of hadoop and data warehousing
Ibm big dataibm marriage of hadoop and data warehousingIbm big dataibm marriage of hadoop and data warehousing
Ibm big data ibm marriage of hadoop and data warehousing DataWorks Summit
 
Big Data World Forum
Big Data World ForumBig Data World Forum
Big Data World Forumbigdatawf
 
How to Crunch Petabytes with Hadoop and Big Data using InfoSphere BigInsights...
How to Crunch Petabytes with Hadoop and Big Data using InfoSphere BigInsights...How to Crunch Petabytes with Hadoop and Big Data using InfoSphere BigInsights...
How to Crunch Petabytes with Hadoop and Big Data using InfoSphere BigInsights...Vladimir Bacvanski, PhD
 
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...
How to Crunch Petabytes with Hadoop and Big Data Using InfoSphere BigInsights...DATAVERSITY
 
Unlocking value in your (big) data
Unlocking value in your (big) dataUnlocking value in your (big) data
Unlocking value in your (big) dataOscar Renalias
 
Big data cloud cloud circle keynote_final laura colvine 8th november 2012
Big data cloud cloud circle keynote_final laura colvine 8th november 2012Big data cloud cloud circle keynote_final laura colvine 8th november 2012
Big data cloud cloud circle keynote_final laura colvine 8th november 2012IBM
 
CII Panel Discussion on Cloud Computing
CII Panel Discussion on Cloud ComputingCII Panel Discussion on Cloud Computing
CII Panel Discussion on Cloud ComputingAnand Deshpande
 
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)Ajay Ohri
 
IBM CEC Big Data 2011 06-11 final
IBM CEC Big Data 2011 06-11 finalIBM CEC Big Data 2011 06-11 final
IBM CEC Big Data 2011 06-11 finalCOMMON Europe
 
Big Data = Big Decisions
Big Data = Big DecisionsBig Data = Big Decisions
Big Data = Big DecisionsInnoTech
 
Smarter planet and mega trends presentation 2012
Smarter planet and mega trends presentation 2012Smarter planet and mega trends presentation 2012
Smarter planet and mega trends presentation 2012Joergen Floes
 
OSC2012: Big Data Using Open Source: Netapp Project - Technical
OSC2012: Big Data Using Open Source: Netapp Project - TechnicalOSC2012: Big Data Using Open Source: Netapp Project - Technical
OSC2012: Big Data Using Open Source: Netapp Project - TechnicalAccenture the Netherlands
 

Big Data and Society: Understanding the Opportunities and Challenges

  • 1. Big Data and Society Dirk deRoos IBM World Wide Technical Sales, IBM Big Data Platform dderoos@ca.ibm.com @Dirk_deRoos © 2012 IBM Corporation
  • 2. Agenda
      • What is Big Data?
      • What will our ability to analyze Big Data lead to?
      • What should we do to prepare?
  • 3. What is Big Data? Stories and Definitions
  • 4. Harnessing the Largest Predictive Focus Group in the World
      Purpose
      • Understand public sentiment towards an event: movie trailers
      • Deeply understand the potential customer profile: gender, occupation, intent to watch
      • Alter marketing launch plans based on insight
      Background
      • 1.1 billion tweets analyzed
      • 5.7 million blogs/forum posts
      • 3.5 million messages
      • Also: Facebook, Google+, Tumblr, Flickr
  • 5. Media & Entertainment Social Media Analytics
  • 6. Asian telco reduces billing costs and improves customer satisfaction
      • Real-time mediation and analysis of 6B CDRs per day
      • Data processing time reduced from 12 hrs to 1 sec
      • Hardware cost reduced to 1/8th
      • Proactively address issues (e.g. dropped calls) impacting customer satisfaction
  • 7. Pacific Northwest Smart Grid Demonstration Project
      Capabilities:
      • Stream Computing: real-time control system
      • Deep Analytics Appliance: analyze massive data sets
      Demonstrates scalability from 100 to 500K homes while retaining 10 years’ historical data.
      Accommodates ad hoc analysis of price fluctuation, energy consumption profiles, risk, fraud detection, grid health, etc.
  • 8. Watson’s advanced analytic capabilities can sort through the equivalent of 200 MILLION pages of data to uncover an answer in 3 SECONDS.
  • 9. The Jeopardy! Challenge: Question Answering Solution
      Requirements: Broad/Open Domain, Complex Language, High Precision, Accurate Confidence, High Speed
      Sample clues:
      • $200: If you're standing, it's the direction you should look to check out the wainscoting.
      • $1000: The first person mentioned by name in ‘The Man in the Iron Mask’ is this hero of a previous book by the same author.
      • $2000: Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north
      • $600: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus
  • 10. Brief History of IBM Watson
      • IBM Research Project (2006 – )
      • Jeopardy! Grand Challenge (Feb 2011)
      • Watson for Healthcare (Aug 2011 – )
      • Watson for Financial Services (Mar 2012 – )
      • Watson Industry Solutions (2012 – )
      Phase labels on the slide: R&D, Demonstration, Commercialization, Cross-industry Applications, Expansion
      From inspiration and invention, through innovation and industrialization, ending with industry transformation.
  • 11. IBM Watson brings together a set of transformational technologies to drive optimized outcomes
      1. Understands natural language and human speech
      2. Generates and evaluates hypothesis for better outcomes (figure labels: 99%, 60%, 10%)
      3. Adapts and learns from user selections and responses
      …built on a massively parallel probabilistic evidence-based architecture optimized for POWER7
  • 12. Healthcare industry is beset with some of the most complex information challenges we collectively face
      • Medical information is doubling every 5 years, much of which is unstructured
      • 81% of physicians report spending 5 hours or less per month reading medical journals
      • “Medicine has become too complex (and only) about 20% of the knowledge clinicians use today is evidence-based.” Steven Shapiro, Chief Medical & Scientific Officer, UPMC
      Source: International Journal of Circumpolar Health, DoctorDirectory.com, Institute for Medicine
  • 13. IBM and Seton Health put Ready for Watson to work
      • Proactively target care management that reduces re-admission of congestive heart failure patients
      • Improve patient quality of life, reduce cost and mortality rates
      • Analyze unstructured data (e.g., physician notes) and provide an integrated view of clinical and operational information
      • Analysis revealed:
          • 18 top indicators determined
          • 2 key re-admission factors
  • 14. Traditional Approach vs. New Approach
      Traditional Approach: structured, analytical, logical
      • Platform: Data Warehouse
      • Traditional sources: Transaction Data, Internal App Data, Mainframe Data, ERP data
      • Characteristics: Structured, Repeatable, Linear
      New Approach: creative, holistic thought, intuition
      • Platform: Hadoop, Streams
      • New sources: Web Logs, Social Data, Text Data: emails, Sensor data: images, RFID
      • Characteristics: Unstructured, Exploratory, Iterative
      The two approaches are joined by Enterprise Integration.
  • 15. 1TB = 1000GB = 108 DVD movies; 1PB = 1000TB = 108,000 DVD movies
      • 30 billion RFID tags today (1.3B in 2005)
      • 4.6 billion camera phones world wide
      • 12+ TBs of tweet data every day
      • 100s of millions of GPS enabled devices sold annually
      • ? TBs of data every day
      • 25+ TBs of log data every day
      • 2+ billion people on the Web by end 2011
      • 76 million smart meters in 2009… 200M by 2014
  • 16. The Big Data Opportunity: Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible.
  • 17. What can you do with Big Data?
      • Financial Services: Fraud detection; 360° View of the Customer
      • Utilities: Weather analysis; Smart grid management
      • Transportation: Logistics optimization; Traffic congestion
      • IT: System Log Analysis; Cybersecurity
      • Health & Life Sciences: Epidemic early warning; ICU monitoring
      • Retail: 360° View of the Customer; Real-time promotions
      • Telecommunications: Geomapping / marketing; Network monitoring
      • Law Enforcement: Multimodal surveillance; Cyber security detection
  • 18. Why Didn’t We Use All of the Big Data Before?
  • 19. What will our ability to Analyze Big Data lead to? Changes in Media, Changes in Mentality
  • 20. Rise of the Data Scientist Role in Organizations
      • What is a Data Scientist?
          • Statistics expert
          • Text analytics expert
          • Data integration expert
          • Someone who specializes in exploring data in new ways, looking for value
      • This will become a normal role in organizations
  • 21. More Information will be Accessible to More People
      • Large organizations will not be able to control the world’s data
          • Not about WikiLeaks
          • Many information gathering sources
      • More people will be able to perform deep analysis
          • Data Scientist skills and tools will increasingly be available at a lower cost
              • Cloud technology
              • Lower cost, higher capacity hardware
  • 22. Analytics has Evolved from Business Initiative to Business Imperative
      Analytically sophisticated companies outperform their competition.
      • Respondents who say analytics creates a competitive advantage: 37% in 2010, 58% in 2011 (57% increase)
      • Organizations achieving a competitive advantage with analytics are 2.2x more likely to substantially outperform their industry peers
      Source: The New Intelligent Enterprise, a joint MIT Sloan Management Review and IBM Institute of Business Value analytics research partnership. Copyright © Massachusetts Institute of Technology 2011
  • 23. Accelerated Pace of Change
  • 24. Volatility in Media Usage
  • 25. Changes in Societal View of Privacy
      • Privacy is a social construction
          • Changed through history
      • Growing acceptance of individual and aggregate monitoring
          • 69% of Americans store personal data in the cloud. (Pew Internet, 2008)
          • 85% of Americans own a cell phone. (Pew Internet, 2011)
      Source: Pew Internet, 2008, 2011
  • 26. What Should We Do? Watch, Think, Legislate
  • 27. Quest for Governance of Big Data
      • Businesses seek maximum capitalization of Big Data
          • Retail
              • Increased tracking of customer behavior
          • Telco/Internet
              • Individual profiling based on personal browsing or calling history
              • Rights restrictions for digital media: pay per use
      • Governments seek maximum well-being of society through Big Data
          • Security
              • Everything can be monitored
          • Health care
              • Centralize all records
              • Genomic data
      • Individuals seek protection from having their data used against them
  • 28. Impact of Technological Change is Unpredictable
  • 29. As Changes Occur, Society Must React
      • What kind of world do we want?
          • Opportunity for business profits
          • Greater good of society
          • Rights of individual / freedom
      • Constant forces
          • Markets will always strive for efficiency
          • Technological changes are always coming
      • Balance priorities
  • 30. (no text)
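
The telco example on slide 6 involves mediating a stream of call detail records (CDRs), and the accompanying note mentions de-duplicating incoming CDRs against 15 days' worth of data. As a rough illustration only, the idea of window-based de-duplication can be sketched in Python as follows. This is not InfoSphere Streams code: the class name `CdrDeduplicator`, the record key fields, and the in-memory store are all assumptions made for the example, and a deployment handling billions of records per day would need a distributed or disk-backed store.

```python
from collections import OrderedDict
from datetime import datetime, timedelta


class CdrDeduplicator:
    """Hypothetical helper: flags CDRs whose key was already seen
    within a retention window (15 days by default)."""

    def __init__(self, window=timedelta(days=15)):
        self.window = window
        # Maps record key -> timestamp at which it was first seen.
        # Assumes records arrive in roughly increasing time order,
        # so the oldest entries sit at the front.
        self._seen = OrderedDict()

    def _evict(self, now):
        # Drop keys that have aged out of the retention window.
        while self._seen:
            key, first_seen = next(iter(self._seen.items()))
            if now - first_seen <= self.window:
                break
            self._seen.popitem(last=False)

    def is_new(self, key, timestamp):
        """Return True the first time a key appears within the window."""
        self._evict(timestamp)
        if key in self._seen:
            return False
        self._seen[key] = timestamp
        return True


# Illustrative records; the key fields are made up for this sketch.
dedup = CdrDeduplicator()
t0 = datetime(2012, 1, 1)
records = [
    (("caller-a", "callee-b", "seq-1"), t0),
    (("caller-a", "callee-b", "seq-1"), t0 + timedelta(hours=1)),  # duplicate
    (("caller-a", "callee-b", "seq-1"), t0 + timedelta(days=16)),  # window expired
]
flags = [dedup.is_new(key, ts) for key, ts in records]
print(flags)
```

Only records flagged as new would be passed on to billing and analytics; the retention window bounds how much state has to be kept.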

Editor's notes

  1. We have been working with an Indian telco client for some time now to help reduce their billing costs and improve customer satisfaction.
     Challenge: Call Detail Record (CDR) processing within their data warehouse was sub-optimal. They could not achieve real-time billing, which required handling billions of CDRs per day and de-duplication against 15 days' worth of CDR data. They were unable to support future IT and business needs with real-time analytics.
     Solution: A single platform for mediation and real-time analytics reduces IT complexity. The PMML standard is used to import data mining models from InfoSphere Warehouse. CDR processing was offloaded to InfoSphere Streams, resulting in enhanced data warehouse performance and improved TCO. Each incoming CDR is analyzed using these data mining models, allowing immediate detection of events (e.g. dropped calls) that might create customer satisfaction issues.
     Business Benefit: Data is now processed at the speed of business, from 12 hours to 1 second. Hardware costs are reduced to 1/8th. Future growth (more data, more analysis) is supported without the need to re-architect. A platform is in place for real-time analytics to drive revenue.
  2. IBM has been working with one of the leading non-profit research institutes, which is leading a regional project to prove the viability and benefits of smart grid technology and to test the concept of demand-based electrical power pricing. Background: The project is the largest initiative of its kind in the US and is designed to test and quantify smart grid costs and benefits with over 60,000 consumers in five states: Washington, Oregon, Idaho, Montana and Wyoming. The smart grid technique uses an incentive and a feedback signal to help coordinate smart grid resources. The two-way communication of this information, from power source to destination, allows intelligent devices and consumers to make smart decisions about using this energy. The requirements of the project call for a robust infrastructure that facilitates two-way data flow and computing power capable of continuously processing petabytes of data. Solution: IBM is building the infrastructure to disseminate the project's transactive incentive signal and interlace it with the participants' responsive assets. The solution consists of IBM stream computing software running on IBM x86 servers to allow for the effective streaming of data, and an IBM data warehouse appliance to analyze and understand the project data (up to 10 petabytes) in minutes. Benefits: • Enabled a town to avoid a power outage by using a two-way advanced meter system to shut off home water heaters during peak periods, reducing strain on an unreliable underwater cable • Empowers consumers to make educated choices about how and when to use electricity, and at what price • Increases grid efficiency and reliability through system self-monitoring and feedback
  3. Most of you know of Watson, our computing system designed to compete on the Jeopardy game show. Watson represents a breakthrough in terms of the volume of information stored and the ability to access it quickly (answering natural language questions). I think Watson is impressive because there are many commercial uses for this technology, and the technology exists today! The game Jeopardy provides the ultimate challenge for Watson because the game's clues involve analyzing subtle meanings, irony, riddles, and other complexities in which humans excel and computers traditionally do not. If you think about Deep Blue, the 1997 IBM machine that defeated the reigning world chess champion, Watson is yet another major leap in the capability of IT systems to identify patterns, gain critical insight and enhance decision-making despite daunting complexities. While Deep Blue was amazing, it was an achievement of the application of compute power to a computationally well-defined and well-bounded game: chess. Watson, on the other hand, faces a challenge that is open-ended and defies the well-bounded mathematical formulation of a game like chess. Watson has to operate in the near limitless, ambiguous, and highly contextual domain of human language and knowledge. Watson answers a Grand Challenge: Can IBM design a computing system that rivals a human's ability to answer questions posed in natural language by interpreting meaning and context and then retrieving, analyzing and understanding vast amounts of information in real time? IBM Watson is a breakthrough in analytic innovation, proving that it is possible to harness vast amounts of information and rival a human's ability to answer questions posed in natural language in real time. But it doesn't matter how good the machine is if we don't have good information to feed it.
We live in a time where a computer can compete against humans at answering questions in plain English, based on storing, retrieving, analyzing and understanding vast amounts of information at real-time speeds. These same capabilities can enable you to improve and optimize your business, too. IBM just showed the value of putting that information to work by creating a computing system capable of competing on Jeopardy. Well, there's a lot of technology that went into Watson, and a lot of Big Data technology in there as well. Now take a moment and think about how this iconic game show is played: you have to answer a question within three seconds. The technology used to analyze and return answers in Watson was a precursor to the Streams technology; in fact, Streams was invented because the technology used in Watson wasn't fast enough for some of the in-motion requirements of companies today. Jeopardy questions are not straightforward; they have puns and tricks to make them harder, so some of our text analytics technology with natural language processing, which is part of the IBM Big Data platform, is in there too (that's yet another MAJOR DIFFERENTIATOR for IBM in Big Data: our Text Analytic Toolkit, which you will hear more about later in this presentation). It wasn't always smooth sailing for Watson. The big breakthrough came when the team started to use machine learning (ML), and the IBM Big Data platform will further differentiate itself from the field in 2012 when a corresponding toolkit comes to market, just like the text analytics toolkit. Finally, Watson had to have access to a heck of a lot of data, and Big Data technologies were used to load and index over 200 million pages of data; Watson had everything from encyclopedias, to the Bible, to the world-famous music and movie databases, etc. All of the technologies mentioned in the previous paragraph had to work together as well.
So IBM clearly has some inflection-point understanding of these technologies and how to get them working together. In the case of text analytics and machine learning, well, we have to make that easier to consume, because you don't have the world's largest commercial research organization for math at your fingertips. So we need to build tooling, optimization, and accelerators around that and put these technologies inside consumable toolkits, which we are doing now.
  4. To know that we are making progress on scientific problems like open-domain QA, well-defined challenges help demonstrate that we can solve concrete and difficult tasks. As you might know, Jeopardy! is a long-standing, well-regarded and highly challenging television quiz show in the US that demands that human contestants quickly understand and answer richly expressed natural language questions over a staggering array of topics. The Jeopardy! Challenge uniquely provides a palpable, compelling and notable way to drive the technology of question answering along key dimensions. If you are familiar with the quiz show, it asks an incredibly broad range of questions over a huge variety of topics. In a single round there is a grid of 6 categories, and for each category 5 rows with increasing dollar values. Once a cell is chosen by one of the three players, a question, or what is often called a clue, is revealed. Here you see some example questions. <read some of the questions> Jeopardy! uses complex and often subtle language to describe what is being asked. To win you have to be extraordinarily precise. You must deliver the exact answer, no more and no less. It is not good enough for it to be somewhere in the top 2, 10 or 20 documents; you must know it exactly and get it in first place. Otherwise you get no credit; in fact, you lose points. You must demonstrate accurate confidences: that is, you must know what you know. If you "buzz in" and then get it wrong, you lose the dollar value of the question. And you have to do all this very quickly: deeply analyze huge volumes of content, consider many possible answers, compute your confidence and buzz in, all in just seconds. As we shall see, competing with human champions at this game represents a Grand Challenge in automatic open-domain question answering. <STOP> <NEXT SLIDE>
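To make the "accurate confidences" point concrete, here is a small Python sketch of the arithmetic behind a buzz-in decision. It is only an illustration of the scoring rule described in this note (gain the clue's value if right, lose it if wrong), not Watson's actual strategy; the function names and the `threshold` parameter are hypothetical, and it ignores opponents and the state of the game. With confidence p and clue value V, the expected gain is p·V − (1 − p)·V = (2p − 1)·V.

```python
def expected_gain(confidence, clue_value):
    """Expected dollar change from buzzing in: win the value if right, lose it if wrong."""
    return confidence * clue_value - (1.0 - confidence) * clue_value


def should_buzz(confidence, clue_value, threshold=0.0):
    """Buzz only when the expected gain exceeds the (assumed) threshold."""
    return expected_gain(confidence, clue_value) > threshold
```

Under this simplified rule the break-even point is a confidence of 0.5 regardless of the clue's value, which is why a well-calibrated confidence estimate matters as much as the answer itself.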
  5. 01/18/12 IOD2011 4/9/12 GS302_ManojSaxena_v7
  6. Main point: At the core of what makes Watson different are three powerful technologies: natural language, hypothesis generation, and evidence-based learning. But Watson is more than the sum of its individual parts. Watson is about bringing these capabilities together in a way that's never been done before, resulting in a fundamental change in the way businesses look at quickly solving problems. Further speaking points: Looking at these one by one, understanding natural language and the way we speak breaks down the communication barrier that has stood between people and their machines for so long. Hypothesis generation bypasses the historically deterministic way that computers function and recognizes that there are various probabilities of various outcomes rather than a single definitive 'right' response. And adaptation and learning help Watson continuously improve in the same way that humans learn: it keeps track of which of its selections were selected by users and which responses got positive feedback, thus improving future response generation. Additional information: The result is a machine that functions alongside us as an assistant rather than something we wrestle with to get an adequate outcome.
  7. Challenge: Reduce the occurrence of high-cost Congestive Heart Failure (CHF) readmissions by proactively identifying patients likely to be readmitted on an emergent basis. Solution: Seton Healthcare is a not-for-profit organization; the Seton Family is the leading provider of healthcare services in Central Texas, serving an 11-county population of 1.9 million. Target and understand high-risk CHF patients for care management programs using natural language processing. Used predictive models that have demonstrated high positive predictive value against extracted structured and unstructured data. Results: Proactively targeted care management, which will reduce readmission of CHF patients. Identified patients likely to be readmitted and introduced early interventions, which will reduce cost and mortality rates and improve patient quality of life. Background: Seton Healthcare is a not-for-profit organization; the Seton Family is the leading provider of healthcare services in Central Texas, serving an 11-county population of 1.8 million. Seton Healthcare identified an opportunity to significantly reduce the occurrence of high-cost CHF readmissions by proactively identifying patients likely to be readmitted on an emergent basis. Objectives: Seton will partner with IBM to implement content and predictive analytics to identify patients who should receive proactive medical case management and intervention. The expectation is that Seton can reduce the occurrence of costly readmissions, reduce mortality rates and improve the quality of life for these patients. Project description: CHF prevention and reduced readmission is a main focus of Seton's Clinical Design Center. The key clinical, financial, and contextual data for CHF patients span many applications and are stored in both structured and unstructured content.
To achieve the Design Center objectives, the following capabilities are needed: integrate these data into longitudinal patient records; identify important information in the unstructured data; develop predictive models that show the likelihood of readmission, the likelihood of ambulatory-sensitive ED visits and admissions, and forecasted next-year costs; and display predictive model results along with aggregated patient record data in a visual, easily navigable system.
  8. Key points: Traditional technologies are very well suited to structured, repeatable tasks; when you do something many times, it makes sense to structure it. They also have controls in place for the accuracy and quality of the data, and they suit historical data and trend analysis. New technologies are complementary: they address speed and flexibility. They are very good at one-time or ad hoc analysis, and also good at exploration, that is, determining new questions to ask. The point is that organizations need both sides, and data growth (or big data) is a challenge for both sides. A big data platform has to address both sides to truly address enterprise needs.
  9. Obviously, there are many other forms and sources of data. Let's start with the hottest topic associated with Big Data today: social networks. Twitter generates about 12 terabytes of tweet data every single day. Now, keep in mind, these numbers are hard to count on, so the point is that they're big, right? So don't fixate on the actual numbers, because they change all the time, and realize that even if these numbers are out of date in 2 years, the volume is already too staggering to handle exclusively using traditional approaches. +CLICK+ Facebook over a year ago was generating 25 terabytes of log data every day (Facebook log data reference: http://www.datacenterknowledge.com/archives/2009/04/17/a-look-inside-facebooks-data-center/ ) and probably about 7 to 8 terabytes of data that goes up on the Internet. +CLICK+ Google, who knows? Look at Google Plus, YouTube, Google Maps, and all that kind of stuff. So that's the left hand of this chart: the social network layer. +CLICK+ Now let's get back to instrumentation: there are massive amounts of proliferated technologies that allow us to be more interconnected than at any time in the history of the world, and it isn't just P2P (people to people) interconnections, it's M2M (machine to machine) as well. Again, with these numbers, who cares what the current number is? I try to keep them updated, but the point is that even if they are out of date, it's almost unimaginable how large these numbers are. There are over 4.6 billion camera phones that leverage built-in GPS to tag the location of your photos, purpose-built GPS devices, and smart meters. If you recall the bridge that collapsed in Minneapolis a number of years ago in the USA, it was rebuilt with smart sensors inside it that measure the contraction and flex of the concrete based on weather conditions, ice build-up, and so much more.
So I didn't realize how true it was when Sam P launched Smart Planet: I thought it was a marketing play. But truly the world is more instrumented, interconnected, and intelligent than it's ever been, and this capability allows us to address new problems and gain new insight never before thought possible, and that's what the Big Data opportunity is all about!
  10. We like to define the characteristics of Big Data at IBM as Variety, Velocity and Volume. +CLICK+ If you start at the bottom, volume is pretty simple. We all understand we're going from terabytes to petabytes and into a zettabyte world. I think most of us understand today just how much data is out there now and what's coming (at least you should after the first couple of slides in this presentation). The variety aspect is something kind of new to us in the data warehousing world, and it essentially means that our analytics can no longer be just for structured data; moreover, analytics on structured data doesn't have to be in a traditional database that requires consistency and integrity (since the data won't be kept long, as with a log file, for example). The Big Data era is characterized by the need and desire to explore beyond structured data: we want to fold in unstructured data as well. If you look at a Facebook post or a tweet, it may come in a structured format (JSON), but the true value is in the unstructured part: the text of your tweet, your Facebook status or your post. Because unstructured content arrives inside a structured wrapper, we refer to that as semi-structured data. So now we're looking at all sorts of different kinds of data. Finally, there's velocity. Other vendors who don't have as big a Big Data scope as we have at IBM will call velocity the speed at which the volume grows, but I think it's fair to say that that's part of volume. We talk about velocity as being how fast the data arrives at the enterprise, and of course, that leads to the question: how long does it take you to do something about it? Velocity in this context is a MAJOR IBM differentiator. Now keep in mind that a Big Data problem could involve solely one of these characteristics, or all of them.
  11. We all know there exists a SQL-controlled relational database warehouse, so why are we at this era of Big Data? I think the two images on this slide really sum it up with a decent analogy around gold mining. Think about the guy on the left, where you see this old-timer gold miner sifting for gold in a river, hoping to find big chunks of gold in his sifter. If someone found big chunks of gold, word would spread and that would spark a big gold rush. The find would pave the way for lots of investment, and eventually a town would spring up around this valuable find. What's a characteristic of this scenario? When you look at that gold, you can visually see it, and I would refer to gold (data) as having a visible value (high value per byte data). You can see it. It's obvious. It's valuable, and therefore I can build a business case and invest in bringing this obvious high value per byte data into the warehouse, which indeed is a Big Data technology. Now bringing data into a warehouse is inherently more expensive (for good reasons), because in a warehouse we are taught that this is pristine data, the single version of the truth; it's got to be enriched, it's got to be documented, glossarized, transformed; and we do that because we know it is high value per byte data. Now, although mining towns sprang up around a gold find, folks didn't go and dig up the mountains around the stream. Why? Because there is so much dirt (low value per byte data), and you didn't have enough information or the right capital equipment to process all that dirt on a hunch. Now think of gold mining today; it's a very different process from what I outlined on the left. In today's gold mining, you actually can't see most of the gold that's found. Gold has to be 30 parts per million (ppm) ore or greater for you to see it, so most gold mined today isn't visible to the naked eye.
Instead, today there exists massive capital equipment that's able to go through and process lots and lots of dirt (low value per byte data) and find for extraction strains of gold (high value data) that are otherwise invisible to the naked eye. So today's gold mine collects all these strains of gold and brings together value (insight). I was watching a gold mining documentary the other day, and they talked about how, after a recent discovery, they chemically treat the dirt to find even finer grains of gold, so this particular company was going to go back to the dirt that they had already processed, chemically treat it, and find more gold (value) than what was found in the initial extraction. I think analytics is (or will be) just like that, and that's yet another reason why Big Data complements the existing warehouse. Five years from now, we'll be able to do more and more analytically on the data we have today, and we're going to understand inflection points and trends better than we can today. That's just one of the reasons why developing a corpus of information, and keeping it, not only makes today's models more accurate, but presents unknown opportunities for the future. In the end we have to look at ways of synergizing the analysis of data, because producing data is much easier than making sense of it, and that rings more and more true each day in a Big Data era.
  12. Slide #4: IBV-MIT DATA. You see the data on this chart, from a study conducted by our Institute for Business Value and MIT Sloan Management Review. The number of enterprises using analytics to create a competitive advantage jumped almost 60 percent in just one year. Nearly 6 out of 10 organizations are now differentiating through analytics. We found that the overall increase in advantage went almost exclusively to organizations that were already experienced users of analytics, so the early adopters are extending their leadership. Those organizations are more than twice as likely to substantially outperform their peers. So we're seeing early bifurcation of the market into leaders and followers. This is reinforced by a separate MIT study that found analytics led to 5-6 percent productivity increases, which is big enough in most industries to separate the winners from the losers. That's all change that's happening within enterprises.