Polling Question
How Important is Big Data to your business?

___ Very Important

___ Somewhat Important

___ Not Important




                                              Page 1
     © Hortonworks Inc. 2012
Big Data, Hadoop, Hortonworks and
Microsoft HDInsight




Your Presenters

 Jim Walker
   •   Director, Prod Marketing
   •   Computer Security and MDM




 Saptak Sen
   •   Senior Product Manager
   •   Big Data & NoSQL Technology




Why Data Driven Business?

                                       Data driven decisions are
                                       better decisions – it's as simple
                                       as that. Using big data enables
                                       managers to decide on the
                                       basis of evidence rather than
                                       intuition. For that reason it has
                                       the potential to revolutionize
                                       management
                                       Harvard Business Review
                                       October 2012


Big Data: Organizational Game Changer

Transactions + Interactions + Observations = BIG DATA

[Chart: data volume plotted against increasing data variety and complexity]
Volume tiers: Megabytes (ERP), Gigabytes (CRM), Terabytes (WEB), Petabytes (BIG DATA)
ERP: Purchase detail, Purchase record, Payment record
Other data types plotted: Segmentation, Customer Touches, Support Contacts,
Offer details, Offer history, Web logs, A/B testing, Behavioral Targeting,
Dynamic Pricing, Search Marketing, Affiliate Networks, Dynamic Funnels,
Mobile Web, Sentiment, SMS/MMS, User Click Stream, Speech to Text,
Social Interactions & Feeds, Spatial & GPS Coordinates, Sensors / RFID / Devices,
Business Data Feeds, External Demographics, User Generated Content,
HD Video, Audio, Images, Product/Service Logs


Polling Question
What tools are you using with Big Data?

___ Hadoop

___ NoSQL

___ Other

___ All of the above




Big Data: Optimize Outcomes at Scale
            Sports   optimize   Championships
      Intelligence   optimize   Detection
           Finance   optimize   Algorithms
       Advertising   optimize   Performance
             Fraud   optimize   Prevention
Retail / Wholesale   optimize   Inventory turns
     Manufacturing   optimize   Supply chains
        Healthcare   optimize   Patient outcomes
         Education   optimize   Learning outcomes
        Government   optimize   Citizen services
                                          Source: Geoffrey Moore. Hadoop Summit 2012 keynote presentation.

A little history… it’s 2005




…and then there was MapReduce




Apache Hadoop: Center of Big Data Strategy


Open Source data management with scale-out storage &
distributed processing

Storage: HDFS
  • Distributed across “nodes”
  • Natively redundant
  • Name node tracks locations

Processing: Map Reduce
  • Splits a task across processors “near” the data & assembles results
  • Self-Healing, High Bandwidth Clustered Storage

Key Characteristics
  • Scalable
      – Efficiently store and process petabytes of data
      – Linear scale driven by additional processing and storage
  • Reliable
      – Redundant storage
      – Failover across nodes and racks
  • Flexible
      – Store all types of data in any format
      – Apply schema on analysis and sharing of the data
  • Economical
      – Use commodity hardware
      – Open source software guards against vendor lock-in
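
The "splits a task across processors and assembles results" idea can be illustrated with a minimal, single-process sketch. This is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration, and a real cluster would run each map call in parallel on the node that holds that split.

```python
from collections import defaultdict


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(mapped_outputs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Assemble the final result by summing the values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(splits):
    # Each split is processed independently; on a cluster these calls
    # would run in parallel, close to where the data is stored.
    mapped = [map_phase(split) for split in splits]
    return reduce_phase(shuffle(mapped))


# Hypothetical input: two splits, as if stored on two different nodes.
splits = [["big data big"], ["data hadoop"]]
counts = word_count(splits)
print(counts)
```

Because each split is mapped independently, adding nodes lets more splits be processed at once, which is the basis of the linear-scale claim on the slide.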


What is a Hadoop “Distribution”?

A complementary set of open source technologies that
make up a complete data platform

Components: Templeton, WebHDFS, Sqoop, Flume, HCatalog, HBase,
Pig, Hive, MapReduce, HDFS, Ambari, Oozie, HA, ZooKeeper




• Tested and pre-packaged to ease installation and usage
• Collects the right versions of the components that all have different
  release cycles and ensures they work together




Apache Hadoop & Big Data Use Cases

                                          Big Data
                             Transactions, Interactions, Observations




                             Refine         Explore          Enrich




                                  Business Case

3
                          Patterns of
                           Hadoop
                             Use

                            Refine

                           Explore

                            Enrich




Einstein Photo: Courtesy: Wikipedia Creative Commons
Balancing Innovation & Stability
• Hadoop is “pre-chasm”
• Ecosystem still evolving
• Enterprises endure 1-3 year adoption cycle

[Chart: technology adoption life cycle, relative % of customers over time]
  1. Innovators, technology enthusiasts
  2. Early adopters, visionaries
     The CHASM
  3. Early majority, pragmatists
  4. Late majority, conservatives
  5. Laggards, skeptics

Before the chasm: customers want technology & performance
After the chasm: customers want solutions & convenience

Source: Geoffrey Moore - Crossing the Chasm










Demonstration




Mining Market Data
 – Showcase back testing on Interactive Data
 – Leveraging Excel Tool & BI Tool




Looking Ahead | Microsoft PolyBase

                                  “I’ve said it before: Massively Parallel
                                  Processing (MPP) data warehouse
                                  appliances are Big Data databases.”
                                                               - Andrew Brust



 SQL Server PDW
 Single query for relational & Hadoop data
 Process data in place                                              PolyBase
 Seamless: Regular T-SQL command
 Future expansion to other data sources




Hadoop
   Better on Windows
     • Active Directory
     • System Center


   Microsoft Data Connectivity
     • SQL Server / SQL Parallel Data Warehouse
     • Azure Storage / Azure Data Market


   Microsoft Business Intelligence (BI)
     • ODBC Connectivity
Leading Innovation at the Core

We focus on innovating the
core of Apache Hadoop

• Hortonworks employs the original
  Architects, Builders and Operators
  of Apache Hadoop

• All Apache, NO holdbacks
  100% of all code contributed back
  to open source Apache projects

Number of Apache Hadoop Committers by Company
  MapR          1
  Facebook      4
  Cloudera      8
  Yahoo!        9
  (unlabeled)  17


What we do…

                                    We believe that by the end of 2015,
                                    more than half the world's data will be
                                    processed by Apache Hadoop.

  Strategy: invest in Apache Hadoop to make it “The enterprise big data platform”


Distribution
• Hortonworks Data Platform (HDP)
• Enterprise Ready, Stable, Reliable, Tested
• 100% open source
• Built by the architects, builders and operators of Apache Hadoop

Ecosystem
• Enable an Ecosystem of Big Data Apps
• Our goal is to make sure all your tools work WITH Hadoop
• HDP is Hadoop for
  • Microsoft
  • Teradata

Support
• Deliver highest quality support and expertise
• Access to Apache Hadoop Experts
• Hadoop training and certification by the Hadoop experts (web, public, private)



Hadoop in Enterprise Data Architectures
[Architecture diagram, top to bottom]

Existing Business Infrastructure: IDE & Dev Tools, ODS & Datamarts,
  Applications & Spreadsheets, Visualization & Intelligence, Discovery Tools, EDW
Web: Web Applications, Low Latency/NoSQL
New Tech: Datameer, Tableau, Karmasphere, Splunk
Operations: Custom, Existing

Hadoop components: Templeton, WebHDFS, Sqoop, Flume, HCatalog, HBase,
  Pig, Hive, MapReduce, HDFS, Ambari, Oozie, HA, ZooKeeper

Big Data Sources (transactions, observations, interactions):
  CRM, ERP, financials, Social Media, Exhaust Data, logs, files


Big Data: It’s About Scale & Structure
                 RDBMS / EDW / MPP               NoSQL / Hadoop
data types       Structured                      Multi and unstructured
processing       Limited, no data processing     Processing coupled with data
governance       Standards and structured        Loosely structured
schema           Required on write               Required on read
speed            Reads are fast                  Writes are fast
cost             Software License                Support only
resources        Known entity                    Growing, complexities, wide
best fit use     Interactive OLAP Analytics      Data Discovery
                 Complex ACID Transactions       Processing unstructured data
                 Operational Data Store          Massive Storage/Processing
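
The "required on read" schema entry can be illustrated with a small sketch. It assumes raw records are stored as JSON lines without any enforced structure; the sample records, the field names, and the helper `read_with_schema` are hypothetical and only show the idea that the same stored data can be read through different schemas chosen at analysis time.

```python
import json

# Hypothetical raw records, stored as-is with no schema enforced on write.
RAW_EVENTS = [
    '{"user": "a", "amount": "12.50", "channel": "web"}',
    '{"user": "b", "amount": "3", "device": "mobile"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: select fields and coerce their types.

    Fields missing from a record are returned as None instead of
    causing the record to be rejected.
    """
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            row[field] = cast(value) if value is not None else None
        rows.append(row)
    return rows


# Two different "views" over the same stored data.
billing_view = read_with_schema(RAW_EVENTS, {"user": str, "amount": float})
channel_view = read_with_schema(RAW_EVENTS, {"user": str, "channel": str})
print(billing_view)
print(channel_view)
```

With schema on write, the second record would have to match a fixed table definition before it could be stored at all; here it is stored unchanged and interpreted only when read.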

Big Data, Hadoop, Hortonworks and Microsoft HDInsight

Más contenido relacionado

La actualidad más candente

Actian forrester- hortonworks
Actian   forrester- hortonworksActian   forrester- hortonworks
Actian forrester- hortonworksHortonworks
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack EuropeHortonworks
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata Hortonworks
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks
 
Break Through the Traditional Advertisement Services with Big Data and Apache...
Break Through the Traditional Advertisement Services with Big Data and Apache...Break Through the Traditional Advertisement Services with Big Data and Apache...
Break Through the Traditional Advertisement Services with Big Data and Apache...Hortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Hortonworks
 
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Hortonworks
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopHortonworks
 
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Hortonworks
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data LakeVMware Tanzu
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
Enterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageEnterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageHortonworks
 
The Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationThe Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationHortonworks
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseDataWorks Summit
 

La actualidad más candente (20)

Actian forrester- hortonworks
Actian   forrester- hortonworksActian   forrester- hortonworks
Actian forrester- hortonworks
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data Analytics
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
 
Break Through the Traditional Advertisement Services with Big Data and Apache...
Break Through the Traditional Advertisement Services with Big Data and Apache...Break Through the Traditional Advertisement Services with Big Data and Apache...
Break Through the Traditional Advertisement Services with Big Data and Apache...
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside Hadoop
 
Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
 
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
 
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and HortonworksHow to Become an Analytics Ready Insurer - with Informatica and Hortonworks
How to Become an Analytics Ready Insurer - with Informatica and Hortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
 
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
Leverage Big Data to Enhance Customer Experience in Telecommunications – with...
 
10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake10 Amazing Things To Do With a Hadoop-Based Data Lake
10 Amazing Things To Do With a Hadoop-Based Data Lake
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
Enterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble StorageEnterprise Hadoop with Hortonworks and Nimble Storage
Enterprise Hadoop with Hortonworks and Nimble Storage
 
The Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen ModernizationThe Power of your Data Achieved - Next Gen Modernization
The Power of your Data Achieved - Next Gen Modernization
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data WarehouseHybrid Data Architecture: Integrating Hadoop with a Data Warehouse
Hybrid Data Architecture: Integrating Hadoop with a Data Warehouse
 

Destacado

Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHortonworks
 
Introduction to Microsoft Azure HD Insight by Dattatrey Sindhol
Introduction to Microsoft Azure HD Insight by Dattatrey Sindhol Introduction to Microsoft Azure HD Insight by Dattatrey Sindhol
Introduction to Microsoft Azure HD Insight by Dattatrey Sindhol HARMAN Services
 
Tackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integrationTackling big data with hadoop and open source integration
Tackling big data with hadoop and open source integrationDataWorks Summit
 
Hadoop's Opportunity to Power Next-Generation Architectures
Hadoop's Opportunity to Power Next-Generation ArchitecturesHadoop's Opportunity to Power Next-Generation Architectures
Hadoop's Opportunity to Power Next-Generation ArchitecturesDataWorks Summit
 
Spark Streaming
Spark StreamingSpark Streaming
Spark StreamingEdureka!
 
Practical Kerberos with Apache HBase
Practical Kerberos with Apache HBasePractical Kerberos with Apache HBase
Practical Kerberos with Apache HBaseJosh Elser
 
Apache Phoenix Query Server
Apache Phoenix Query ServerApache Phoenix Query Server
Apache Phoenix Query ServerJosh Elser
 
Interface fonctionnelle, Lambda expression, méthode par défaut, référence de...
Interface fonctionnelle, Lambda expression, méthode par défaut,  référence de...Interface fonctionnelle, Lambda expression, méthode par défaut,  référence de...
Interface fonctionnelle, Lambda expression, méthode par défaut, référence de...MICHRAFY MUSTAFA
 
Scala: Pattern matching, Concepts and Implementations
Scala: Pattern matching, Concepts and ImplementationsScala: Pattern matching, Concepts and Implementations
Scala: Pattern matching, Concepts and ImplementationsMICHRAFY MUSTAFA
 
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013Kai Wähner
 
Big Data, Hadoop, Hortonworks and Microsoft HDInsight

  • 1. Polling Question: How important is Big Data to your business? ___ Very Important ___ Somewhat Important ___ Not Important © Hortonworks Inc. 2012
  • 2. Big Data, Hadoop, Hortonworks and Microsoft HDInsight
  • 3. Your Presenters: Jim Walker (Director, Product Marketing; Computer Security and MDM) and Saptak Sen (Senior Product Manager; Big Data & NoSQL Technology)
  • 4. Why Data Driven Business? "Data driven decisions are better decisions – it's as simple as that. Using big data enables managers to decide on the basis of evidence rather than intuition. For that reason it has the potential to revolutionize management." Harvard Business Review, October 2012
  • 5. Big Data: Organizational Game Changer. Transactions + Interactions + Observations = BIG DATA. [Figure: data volume, from megabytes to petabytes, plotted against increasing data variety and complexity. Tiers: ERP (megabytes), CRM (gigabytes), WEB (terabytes), BIG DATA (petabytes). Data types shown: purchase detail, purchase record, payment record, offer details, offer history, segmentation, customer touches, support contacts, web logs, A/B testing, behavioral targeting, dynamic pricing, search marketing, affiliate networks, dynamic funnels, business data feeds, external demographics, user generated content, HD video, audio, images, product/service logs, mobile web, sentiment, SMS/MMS, user click stream, speech to text, social interactions & feeds, spatial & GPS coordinates, sensors / RFID / devices.]
  • 9. Polling Question: What tools are you using with Big Data? ___ Hadoop ___ NoSQL ___ Other ___ All the above
  • 10. Big Data: Optimize Outcomes at Scale. Sports: optimize championships. Intelligence: optimize detection. Finance: optimize algorithms. Advertising: optimize performance. Fraud: optimize prevention. Retail / Wholesale: optimize inventory turns. Manufacturing: optimize supply chains. Healthcare: optimize patient outcomes. Education: optimize learning outcomes. Government: optimize citizen services. Source: Geoffrey Moore, Hadoop Summit 2012 keynote presentation.
  • 11. A little history… it’s 2005
  • 12. …and then there was MapReduce
  • 13. Apache Hadoop: Center of Big Data Strategy. Open source data management with scale-out storage & distributed processing. Storage (HDFS) • Distributed across “nodes” • Natively redundant • Name node tracks locations. Processing (MapReduce) • Splits a task across processors “near” the data & assembles results. Self-healing, high-bandwidth clustered storage. Key Characteristics • Scalable – Efficiently store and process petabytes of data – Linear scale driven by additional processing and storage • Reliable – Redundant storage – Failover across nodes and racks • Flexible – Store all types of data in any format – Apply schema on analysis and sharing of the data • Economical – Use commodity hardware – Open source software guards against vendor lock-in
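The MapReduce idea on this slide (split a task, process each piece separately, then assemble the results) can be illustrated with a small single-process sketch. This is not Hadoop code and uses no Hadoop API: the function names (`split_input`, `map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and the "nodes" are only simulated by list slices.

```python
from collections import defaultdict
from itertools import chain


def split_input(lines, n_splits):
    """Divide the input into splits, one per simulated node (assumption: in-memory lists)."""
    return [lines[i::n_splits] for i in range(n_splits)]


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in this split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(mapped_pairs):
    """Shuffle: group the emitted values by key across all splits."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: assemble the final result by summing the values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines, n_splits=3):
    """Run the three phases in sequence over the simulated splits."""
    splits = split_input(lines, n_splits)
    mapped = chain.from_iterable(map_phase(s) for s in splits)
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    sample = ["big data hadoop", "hadoop stores big data", "data data data"]
    print(word_count(sample))
```

In a real cluster the map step would run on the nodes that already hold each block of data, which is the "near the data" point made on the slide; the sketch only mirrors the flow of data between phases.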
  • 14. What is a Hadoop “Distribution” A complementary set of open source technologies that make up a complete data platform: Templeton, WebHDFS, Sqoop, Flume, HCatalog, HBase, Pig, Hive, MapReduce, HDFS, Ambari, Oozie, HA, ZooKeeper • Tested and pre-packaged to ease installation and usage • Collects the right versions of the components that all have different release cycles and ensures they work together
  • 15. Apache Hadoop & Big Data Use Cases Big Data Transactions, Interactions, Observations Refine Explore Enrich Business Case
  • 16. 3 Patterns of Hadoop Use Refine Explore Enrich
  • 17. 3 Patterns of Hadoop Use Refine Explore Enrich Einstein Photo: Courtesy Wikipedia Creative Commons
  • 18. 3 Patterns of Hadoop Use Refine Explore Enrich Einstein Photo: Courtesy Wikipedia Creative Commons
  • 19. Balancing Innovation & Stability • Hadoop is “pre-chasm” • Ecosystem still evolving • Enterprises endure 1-3 year adoption cycle. Chart (relative % customers over time): Innovators, technology enthusiasts; Early adopters, visionaries; The CHASM; Early majority, pragmatists; Late majority, conservatives; Laggards, Skeptics. Before the chasm, customers want technology & performance; after it, customers want solutions & convenience. Source: Geoffrey Moore - Crossing the Chasm
  • 24. Demonstration Mining Market Data – Showcase back testing on Interactive Data – Leveraging Excel Tool & BI Tool
  • 25. Looking Ahead | Microsoft PolyBase “I’ve said it before: Massively Parallel Processing (MPP) data warehouse appliances are Big Data databases.” - Andrew Brust. PolyBase: • SQL Server PDW • Single query for relational & Hadoop data • Process data in place • Seamless: Regular T-SQL command • Future expansion to other data sources
  • 26. Hadoop Better on Windows • Active Directory • System Center Microsoft Data Connectivity • SQL Server / SQL Parallel Data Warehouse • Azure Storage / Azure Data Market Microsoft Business Intelligence (BI) • ODBC Connectivity
  • 27. Leading Innovation at the Core We focus on innovating the core of Apache Hadoop • Hortonworks employs the original Architects, Builders and Operators of Apache Hadoop • All Apache, NO holdbacks: 100% of all code contributed back to open source Apache projects. Chart: Number of Apache Hadoop Committers by Company (labels as extracted: MapR, Yahoo!, facebook, Cloudera; values as extracted: 1, 17, 9, 4, 8)
  • 28. What we do… We believe that by the end of 2015, more than half the world's data will be processed by Apache Hadoop. Strategy: invest in Apache Hadoop to make it “The enterprise big data platform”. Distribution: • Hortonworks Data Platform (HDP) • Enterprise Ready, Stable, Reliable, Tested • 100% open source • Built by the architects, builders and operators of Apache Hadoop. Ecosystem: • Enable an Ecosystem of Big Data Apps • Our goal is to make sure all your tools work WITH Hadoop • HDP is Hadoop for Microsoft and Teradata. Support: • Deliver highest quality support and expertise • Access to Apache Hadoop Experts • Hadoop training and certification by the Hadoop experts (web, public, private)
  • 30. Hadoop in Enterprise Data Architectures Existing Business Infrastructure Web New Tech Datameer Tableau Karmasphere IDE & ODS & Applications & Visualization & Web Splunk Dev Tools Datamarts Spreadsheets Intelligence Applications Operations Discovery Tools EDW Low Latency/NoSQL Custom Existing Templeton WebHDFS Sqoop Flume HCatalog HBase Pig Hive MapReduce HDFS Ambari Oozie HA ZooKeeper Social Media, Exhaust Data, logs, files, CRM, ERP, financials: Big Data Sources (transactions, observations, interactions)
  • 31. Big Data: It’s About Scale & Structure. Comparison of RDBMS / EDW / MPP versus NoSQL / Hadoop. Data types: Structured vs. Multi and unstructured. Processing: Limited, no data processing vs. Processing coupled with data. Governance: Standards and structured vs. Loosely structured. Schema: Required on write vs. Required on read. Speed: Reads are fast vs. Writes are fast. Cost: Software License vs. Support only. Resources: Known entity vs. Growing, complexities, wide. Best fit use: Interactive OLAP Analytics, Complex ACID Transactions, Operational Data Store vs. Data Discovery, Processing unstructured data, Massive Storage/Processing
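
The Map Reduce processing idea on slide 13 (split a task across processors, then assemble the results) can be sketched in a few lines of Python. This is only a single-process illustration under stated assumptions, not Hadoop's API: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, and the "splits" stand in for blocks of a file that HDFS would store on different nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(split):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all intermediate values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: assemble one result per key by summing its values."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(splits):
    """Run map over each split independently, then shuffle and reduce."""
    mapped = chain.from_iterable(map_phase(split) for split in splits)
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    # Two hypothetical splits; in a real cluster each would be mapped
    # on the node that already holds that block of data.
    splits = [
        ["big data big insight"],
        ["data near the data"],
    ]
    print(word_count(splits))
```

Each call to `map_phase` depends only on its own split, which is what lets a real cluster run the map step "near" the data and in parallel; only the grouped intermediate pairs need to move between nodes.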

Editor's notes

  1. For the visual thinkers out there, let’s expand our mathematical model to show some concrete examples. ERP, SCM, CRM, and transactional Web applications are classic examples of systems processing Transactions. Highly structured data in these systems is typically stored in SQL databases. Interactions are about how people and things interact with each other or with your business. Web Logs, User Click Streams, Social Interactions & Feeds, and User-Generated Content are classic places to find Interaction data. Observational data tends to come from the “Internet of Things”. Sensors for heat, motion, and pressure, along with RFID and GPS chips within such things as mobile devices, ATMs, and even aircraft engines, are just some examples of “things” that output Observation data. Most folks would agree that video is “big” data. The analysis of what’s happening in that video (i.e., what you, me, and others are doing in the video) may not be “big”, but it is valuable and it does fit under our umbrella. Moreover, business data feeds and publicly available data sets are also “big data”, so we should not limit our thinking to just data that flows through an organization. For example, the mortgage-related data you may have COULD benefit from being blended with external data found in Zillow. The government has the Open Data Initiative, which means that more and more data is being made publicly available. One of the use cases I find interesting is Predictive Policing, where state and local law enforcement apply analytics to crime databases and other publicly available data to help predict where and when pockets of crime might spring up. These proactive analytics efforts have yielded real reductions in crime! Anyhow, this is what Big Data means to me… hopefully it makes sense to you. It is important to note that we think of big data beyond the traditional concepts of volume, velocity and variety, in terms of transactions, interactions and observations.
In reality, this IS the big data our customers are dealing with.
  2. Future of query processing, from the Jim Gray Systems Lab (Dr. David DeWitt): one interface to query relational & Hadoop data; query data without moving it; expansion to other data sources in the future; seamless integration with unstructured data & Hadoop. This breakthrough technology is going to dramatically simplify how users query relational and Hadoop data. Pioneered in the Jim Gray Systems Lab by David DeWitt, PolyBase is a federated query processor in SQL Server 2012 Parallel Data Warehouse that represents a breakthrough over traditional query processing by joining structured data with unstructured data from Hadoop. Without manual intervention, the PolyBase query processor can accept a standard SQL query and combine tables from a relational source with tables from a Hadoop source directly through external tables. The PolyBase query processor also parallelizes data import/export to and from Hadoop, giving PDW speed, simplicity, and responsiveness in addressing these new types of queries. 1) Ability to issue standard T-SQL that joins relational data with unstructured data in Hadoop. 2) PolyBase rapidly imports/exports data between Hadoop and PDW in parallel. 3) PolyBase can query data in Hadoop directly without movement (with external tables). 4) Created in the Jim Gray Systems Lab by David DeWitt.
  3. And that's the second thing I wanted to share with you this afternoon.
  4. We believe that Hadoop can be in a position to process more than half the world’s data. I’ve talked to a variety of industry analysts, and there’s not a big argument over Hadoop’s opportunity to achieve this. Some would argue it should be 2016 or 2017, rather than 2015. But we believe aggressive goals help focus people on the right things, so let’s keep it 2015 for now, and let’s see how close we can get. The point here is that this statement can act as our “north star” and help guide our way as we focus on our list of 5 items we can be doing: 1) Be diligent stewards of the open source core. 2) Be tireless innovators beyond the core. 3) Provide robust data platform services & open APIs. 4) Enable ecosystem at each layer of the stack. 5) Make platform enterprise-ready & easy to use.