SlideShare a Scribd company logo
1 of 16
Download to read offline
the open source database you’ll never outgrow




      VoltDB Streamlining
     Hadoop for Enterprise
           Adoption

September 2012
Mark Hydar
Market Technology and Strategy
Agenda

  “Big Data” and the Data Landscape
  Our Thoughts on Data Pipelines
  VoltDB Streaming Overview
  Addressing the Topics
         + Hadoop is too complex and expensive for mainstream enterprises.
         + It’s taking too long to find useful insights amid an ocean of low quality, disconnected data.
         + How can my organization reduce costs and mitigate data risks?
         + How can I gain quicker access to operational insights?
         + What can I do to improve data quality and reduce total pipeline processing times?

  Q&A

VoltDB                                                                                                     2
What is “Big Data”?
               Velocity =
               Big Data =          + Volume + Variety

  The old equation of Big Data
         big data = volume = warehouse (OLAP)

  This has changed
         big data = velocity + volume = transactions (OLTP) + warehouse (OLAP)

  Big Data is fast and deep
  As it arrives, you probably do something with it (or
   wish you could)
         you may just want to slow it down!

VoltDB                                                                           3
Data Value Chain
           Value of                                                                                 Aggregate
           Individual                                                                               Data Value
           Data Item




                                                                                                                              Data Value
                                                Age of Data
    Interactive         Real time Analytics     Record Lookup             Historical Analytics        Exploratory
                                                                                                      Analytics
    Milliseconds        Hundredths of seconds   Second(s)                 Minutes                     Hours
    . Place trade       . Calculate risk        . Retrieve click stream   . Backtest algo             . Algo discovery
    . Serve ad          . Leaderboard           . Show orders             . Business Intelligence     . Log analysis
    . Enrich stream     . Aggregate                                       . Daily reports             . Fraud pattern match
    . Examine packet    . Count
    . Approve trans.


VoltDB                                                                                                                    4
Database Landscape
Fast                       Value of
Complex                    Individual                                                                                Aggregate
Large                      Data Item                                                                                 Data Value
 Application Complexity




                                          VoltDB                                                        Hadoop, etc.
                                                                       NoSQL




                                                                                                                                       Data Value
                                                                                     Data Warehouse
                                 NewSQL


                                                        Traditional RDBMS
Simple
Slow
Small


                          Transactional                                                                       Analytic

                          Interactive      Real time Analytics   Record Lookup   Historical Analytics      Exploratory Analytics


         VoltDB                                                                                                                    5
Lifecycle for Big Data
                    trades        clicks
              logins authorizations
            impressions sensors orders
                                                    Make the most informed
                                                     decision every time there
               Real-time Decision Making             is an interaction
                                                    Real-time decisions are
                 Who: Automated                      informed by operational
                 What: Transact,                     analytics & past
                       Operational Analytics
                                                     knowledge
                                                    Sometimes called OLTP
                  Reports & Analytics
  Knowledge
                 Who: Analyst
                                                    “It is not enough to capture massive
                 What: Reports, Drill down, …
                                                    amounts of data; organizations must
                                                    also sift through the data, extract
                                                    information and transform it into
                  Exploratory Analysis              actionable knowledge.”
Knowledge

                 Who: Data Scientist
                 What: Discover trends, rules, …
   VoltDB                                                                              6
The Value Available in Fast Data

  Data-driven decisions in real-time

  Better decisions by using more information sources

  Faster decisions

  Insights into real-time operational analytics


    If I could “Access” to my data sooner, I would be more Insightful than
    the next guy.                        -What your competition is thinking right now

VoltDB                                                                                  7
The Typical Hadoop Data Pipeline
     Complex, Hard                 Slow, Expensive,
       to Do and                      Lacks Data                    Aged Data
       Expensive                       Visibility
     STRUCTURED
•    Social Media
•    Marketing
     Campaigns
•    Customer Profiles
•    Account
     Transactions
•    Special Offers
•    Sensor Data

     Un-Structured
    Semi-Structured
•    Web Logs
•    JSON
•    Streaming
     Technologies


          Ingest         Parse & Prepare     Transform and Clean   Extract and Load
VoltDB                                                                          8
The VoltDB Database
 • High-performance                             VoltDB Performance Advantage
   RDBMS                                      TPC-C single node (Oracle)       45 X


 • In-memory database                         TPC-C single node (MySQL)       100 X



 • Automatic scale-out on
   commodity servers                                Cost Disruption
                                                  Ex. Traditional
 • Built-in high availability                         RDBMS
                                                                           VoltDB

                                                      SPARC
                                                                      18 , 8-core Intel
 • Relational structures,            System     SuperCluster/Oracle
                                                       11g
                                                                          servers

   ACID and SQL                    Price/tpmC          $ 1.01              $0.012



              is Faster, Better, Cheaper than the competition
VoltDB                                                                                9
How VoltDB is Used

  High throughput, relentless data feeds
  Fast operations on high value data
  Real-time analytics present immediate visibility

                                          Data Feed                Real-time                Real-time
                                                                   Operations
                                                               Examine packet by            Analytics
                                                                                       Identify bandwidth
         Network Traffic Monitoring    Network packets
                                                               source / destination    outliers
                                                                                       Recall post trade
         Financial Trade Support       Market orders           Ingest trade data
                                                                                       order groupings
                                                               Identification and      Notification and
         Sensor tracking & analytics   Sensor position feed
                                                               cleansing of tag info   groupings
                                                               Game state updates
         Mobile Gaming                 Online game                                     Leaderboard lookups
                                                               and usage patterns
                                                                                       Report ad
         Digital Ad Tech               Ad bid / click stream   Bid, optimize content
                                                                                       performance


VoltDB                                                                                                10
Why Address the Streaming Gap

  VoltDB Hadoop Data Streaming
         + Real Time Business Decisioning
         + Data Quality and Enrichment
         + Simplifies Data Integration
         + Shortens Time to Market

  Increasing Productivity
         + Common Development and Data Environment
            — Provides Reusability (data flows and computations)
            — Provides Universal Access to Real Time Data
            — Uses well known data access utilities



VoltDB                                                             11
Hadoop Integration
 Motivation
         + Big Data = high velocity (VoltDB) + high volume (Hadoop)
            — VoltDB ingests fire hose, manages state, supports real-time
              analytics, spools to Hadoop
            — Hadoop imports from VoltDB (via Sqoop)
 Technologies
         + VoltDB Export
            — Real-time streaming export
            — Data consolidation, aggregation, enrichment
            — Buffering and overflow to eliminate “impedance mismatches”
            — Bi-directional durability
         + Sqoop
            — Cloudera-authored DBMS=>HDFS importer
            — Pull-based technology

VoltDB                                                                      12
The Optimized Data Pipeline
    Repeatable                     Real Time           Reduced              Business
Integration Model                   Business        Infrastructure          Analytics
(Reduced Complexity)              Intelligence       (Predictability)     (Warehousing)
     STRUCTURED
•    Social Media
•    Marketing
     Campaigns
•    Customer Profiles
•    Account
                             Real Time
     Transactions            Analytics
•    Special Offers
•    Sensor Data         •   SQL/Relational
                         •   ACID
                         •   Scalable
     Un-Structured       •   Data
                             Enrichment
    Semi-Structured
•    Web Logs
•    JSON
•    Streaming
     Technologies


          Ingest                  Parse & Prepare   Transform and Clean   Extract and Load
VoltDB                                                                                    13
VoltDB in the Big Data Landscape (Today)
   Online
  gaming

        Ad
   serving

   Sensor
     data
                      SQL/Relational
 Financial                 ACID
     trade
                     High Performance
  Internet               Scalable
commerce              Highly Available

   SaaS,            Benchmarked at
  Web 2.0          3+ million ACID TPS
    Mobile
 platforms



   VoltDB                                14
The Close

  As Hadoop adoption increases, it has become evident that programming
   Hadoop is too complex and expensive for mainstream enterprises.
  It’s taking too long to find useful insights amid an ocean of low quality,
   disconnected data. How do I address the key barriers to Hadoop
   adoption?
  How can organizations reduce costs and mitigate data risks?
  How can they gain quicker access to operational insights?
  What can I do to improve data quality and reduce total pipeline processing
   times?
  Did we explore strategies that leading organizations are using to streamline
   Hadoop processing and eliminate adoption challenges?


VoltDB                                                                          15
Questions?

                 email mhydar@voltdb.com
                     twitter @mhydar

         Download the VoltDB Enterprise Edition Trial
          http://voltdb.com/products-services/downloads

                 Join the VoltDB Community
                  http://community.voltdb.com

                 More information on VoltDB Blog
                 http://voltdb.com/company/blog

                 Follow @VoltDB on twitter

VoltDB                                                    16

More Related Content

What's hot

Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrGrant Ingersoll
 
Big Data Analytics in a Heterogeneous World - Joydeep Das of Sybase
Big Data Analytics in a Heterogeneous World - Joydeep Das of SybaseBig Data Analytics in a Heterogeneous World - Joydeep Das of Sybase
Big Data Analytics in a Heterogeneous World - Joydeep Das of SybaseBigDataCloud
 
2012.04.26 big insights streams im forum2
2012.04.26 big insights streams im forum22012.04.26 big insights streams im forum2
2012.04.26 big insights streams im forum2Wilfried Hoge
 
Anexinet Big Data Solutions
Anexinet Big Data SolutionsAnexinet Big Data Solutions
Anexinet Big Data SolutionsMark Kromer
 
Dev Lakhani, Data Scientist at Batch Insights "Real Time Big Data Applicatio...
Dev Lakhani, Data Scientist at Batch Insights  "Real Time Big Data Applicatio...Dev Lakhani, Data Scientist at Batch Insights  "Real Time Big Data Applicatio...
Dev Lakhani, Data Scientist at Batch Insights "Real Time Big Data Applicatio...Dataconomy Media
 
Big Data 2.0: YARN Enablement for Distributed ETL & SQL with Hadoop
Big Data 2.0: YARN Enablement for Distributed ETL & SQL with HadoopBig Data 2.0: YARN Enablement for Distributed ETL & SQL with Hadoop
Big Data 2.0: YARN Enablement for Distributed ETL & SQL with HadoopCaserta
 
Pitfalls of Data Warehousing_2019-04-24
Pitfalls of Data Warehousing_2019-04-24Pitfalls of Data Warehousing_2019-04-24
Pitfalls of Data Warehousing_2019-04-24Martin Bém
 
Thu-310pm-Impetus-SachinAndAjay
Thu-310pm-Impetus-SachinAndAjayThu-310pm-Impetus-SachinAndAjay
Thu-310pm-Impetus-SachinAndAjayAjay Shriwastava
 
Four Problems You Run into When DIY-ing a “Big Data” Analytics System
Four Problems You Run into When DIY-ing a “Big Data” Analytics SystemFour Problems You Run into When DIY-ing a “Big Data” Analytics System
Four Problems You Run into When DIY-ing a “Big Data” Analytics SystemTreasure Data, Inc.
 
BigData & CDN - OOP2011 (Pavlo Baron)
BigData & CDN - OOP2011 (Pavlo Baron)BigData & CDN - OOP2011 (Pavlo Baron)
BigData & CDN - OOP2011 (Pavlo Baron)Pavlo Baron
 
Data architecture for modern enterprise
Data architecture for modern enterpriseData architecture for modern enterprise
Data architecture for modern enterprisekayalvizhi kandasamy
 
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012Gigaom
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An OverviewC. Scyphers
 
Database Shootout: What's best for BI?
Database Shootout: What's best for BI?Database Shootout: What's best for BI?
Database Shootout: What's best for BI?Jos van Dongen
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKRajesh Jayarman
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHortonworks
 
Oncrawl elasticsearch meetup france #12
Oncrawl elasticsearch meetup france #12Oncrawl elasticsearch meetup france #12
Oncrawl elasticsearch meetup france #12Tanguy MOAL
 

What's hot (19)

Big data analytics - hadoop
Big data analytics - hadoopBig data analytics - hadoop
Big data analytics - hadoop
 
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and SolrLarge Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
Large Scale Search, Discovery and Analytics with Hadoop, Mahout and Solr
 
Big Data Analytics in a Heterogeneous World - Joydeep Das of Sybase
Big Data Analytics in a Heterogeneous World - Joydeep Das of SybaseBig Data Analytics in a Heterogeneous World - Joydeep Das of Sybase
Big Data Analytics in a Heterogeneous World - Joydeep Das of Sybase
 
2012.04.26 big insights streams im forum2
2012.04.26 big insights streams im forum22012.04.26 big insights streams im forum2
2012.04.26 big insights streams im forum2
 
Anexinet Big Data Solutions
Anexinet Big Data SolutionsAnexinet Big Data Solutions
Anexinet Big Data Solutions
 
Dev Lakhani, Data Scientist at Batch Insights "Real Time Big Data Applicatio...
Dev Lakhani, Data Scientist at Batch Insights  "Real Time Big Data Applicatio...Dev Lakhani, Data Scientist at Batch Insights  "Real Time Big Data Applicatio...
Dev Lakhani, Data Scientist at Batch Insights "Real Time Big Data Applicatio...
 
Big Data 2.0: YARN Enablement for Distributed ETL & SQL with Hadoop
Big Data 2.0: YARN Enablement for Distributed ETL & SQL with HadoopBig Data 2.0: YARN Enablement for Distributed ETL & SQL with Hadoop
Big Data 2.0: YARN Enablement for Distributed ETL & SQL with Hadoop
 
O2 060814
O2 060814O2 060814
O2 060814
 
Pitfalls of Data Warehousing_2019-04-24
Pitfalls of Data Warehousing_2019-04-24Pitfalls of Data Warehousing_2019-04-24
Pitfalls of Data Warehousing_2019-04-24
 
Thu-310pm-Impetus-SachinAndAjay
Thu-310pm-Impetus-SachinAndAjayThu-310pm-Impetus-SachinAndAjay
Thu-310pm-Impetus-SachinAndAjay
 
Four Problems You Run into When DIY-ing a “Big Data” Analytics System
Four Problems You Run into When DIY-ing a “Big Data” Analytics SystemFour Problems You Run into When DIY-ing a “Big Data” Analytics System
Four Problems You Run into When DIY-ing a “Big Data” Analytics System
 
BigData & CDN - OOP2011 (Pavlo Baron)
BigData & CDN - OOP2011 (Pavlo Baron)BigData & CDN - OOP2011 (Pavlo Baron)
BigData & CDN - OOP2011 (Pavlo Baron)
 
Data architecture for modern enterprise
Data architecture for modern enterpriseData architecture for modern enterprise
Data architecture for modern enterprise
 
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
THE 3V's OF BIG DATA: VARIETY, VELOCITY, AND VOLUME from Structure:Data 2012
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An Overview
 
Database Shootout: What's best for BI?
Database Shootout: What's best for BI?Database Shootout: What's best for BI?
Database Shootout: What's best for BI?
 
Big Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RKBig Data Practice_Planning_steps_RK
Big Data Practice_Planning_steps_RK
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data Processing
 
Oncrawl elasticsearch meetup france #12
Oncrawl elasticsearch meetup france #12Oncrawl elasticsearch meetup france #12
Oncrawl elasticsearch meetup france #12
 

Viewers also liked

How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersVoltDB
 
Scaling ETL with Hadoop - Avoiding Failure
Scaling ETL with Hadoop - Avoiding FailureScaling ETL with Hadoop - Avoiding Failure
Scaling ETL with Hadoop - Avoiding FailureGwen (Chen) Shapira
 
Modeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkModeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkSingleStore
 
The State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleThe State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleVoltDB
 
AWS Summit Bogotá Track Avanzado: Arquitecturas y mejores practicas de big da...
AWS Summit Bogotá Track Avanzado: Arquitecturas y mejores practicas de big da...AWS Summit Bogotá Track Avanzado: Arquitecturas y mejores practicas de big da...
AWS Summit Bogotá Track Avanzado: Arquitecturas y mejores practicas de big da...Amazon Web Services
 
Real-time Analytics with Cassandra, Spark, and Shark
Real-time Analytics with Cassandra, Spark, and SharkReal-time Analytics with Cassandra, Spark, and Shark
Real-time Analytics with Cassandra, Spark, and SharkEvan Chan
 
Data Engineering with Spring, Hadoop and Hive
Data Engineering with Spring, Hadoop and Hive	Data Engineering with Spring, Hadoop and Hive
Data Engineering with Spring, Hadoop and Hive Alex Silva
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoptionHortonworks
 
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Cloudera, Inc.
 
Big data: Loading your data with flume and sqoop
Big data:  Loading your data with flume and sqoopBig data:  Loading your data with flume and sqoop
Big data: Loading your data with flume and sqoopChristophe Marchal
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...Data Con LA
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprisesmarkgrover
 
Tachyon and Apache Spark
Tachyon and Apache SparkTachyon and Apache Spark
Tachyon and Apache Sparkrhatr
 
Alluxio: Unify Data at Memory Speed; 2016-11-18
Alluxio: Unify Data at Memory Speed; 2016-11-18Alluxio: Unify Data at Memory Speed; 2016-11-18
Alluxio: Unify Data at Memory Speed; 2016-11-18Alluxio, Inc.
 
Schema-on-Read vs Schema-on-Write
Schema-on-Read vs Schema-on-WriteSchema-on-Read vs Schema-on-Write
Schema-on-Read vs Schema-on-WriteAmr Awadallah
 
SAS and Cloudera – Analytics at Scale
SAS and Cloudera – Analytics at ScaleSAS and Cloudera – Analytics at Scale
SAS and Cloudera – Analytics at ScaleCloudera, Inc.
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksHortonworks
 
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)Spark Summit
 
ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spar...
ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spar...ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spar...
ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spar...Alluxio, Inc.
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 DataWorks Summit
 

Viewers also liked (20)

How to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top ContendersHow to Build Fast Data Applications: Evaluating the Top Contenders
How to Build Fast Data Applications: Evaluating the Top Contenders
 
Scaling ETL with Hadoop - Avoiding Failure
Scaling ETL with Hadoop - Avoiding FailureScaling ETL with Hadoop - Avoiding Failure
Scaling ETL with Hadoop - Avoiding Failure
 
Modeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and SparkModeling the Smart and Connected City of the Future with Kafka and Spark
Modeling the Smart and Connected City of the Future with Kafka and Spark
 
The State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and ScaleThe State of Streaming Analytics: The Need for Speed and Scale
The State of Streaming Analytics: The Need for Speed and Scale
 
AWS Summit Bogotá Track Avanzado: Arquitecturas y mejores practicas de big da...
AWS Summit Bogotá Track Avanzado: Arquitecturas y mejores practicas de big da...AWS Summit Bogotá Track Avanzado: Arquitecturas y mejores practicas de big da...
AWS Summit Bogotá Track Avanzado: Arquitecturas y mejores practicas de big da...
 
Real-time Analytics with Cassandra, Spark, and Shark
Real-time Analytics with Cassandra, Spark, and SharkReal-time Analytics with Cassandra, Spark, and Shark
Real-time Analytics with Cassandra, Spark, and Shark
 
Data Engineering with Spring, Hadoop and Hive
Data Engineering with Spring, Hadoop and Hive	Data Engineering with Spring, Hadoop and Hive
Data Engineering with Spring, Hadoop and Hive
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
 
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
 
Big data: Loading your data with flume and sqoop
Big data:  Loading your data with flume and sqoopBig data:  Loading your data with flume and sqoop
Big data: Loading your data with flume and sqoop
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Alluxio (formerly Tachyon)...
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprises
 
Tachyon and Apache Spark
Tachyon and Apache SparkTachyon and Apache Spark
Tachyon and Apache Spark
 
Alluxio: Unify Data at Memory Speed; 2016-11-18
Alluxio: Unify Data at Memory Speed; 2016-11-18Alluxio: Unify Data at Memory Speed; 2016-11-18
Alluxio: Unify Data at Memory Speed; 2016-11-18
 
Schema-on-Read vs Schema-on-Write
Schema-on-Read vs Schema-on-WriteSchema-on-Read vs Schema-on-Write
Schema-on-Read vs Schema-on-Write
 
SAS and Cloudera – Analytics at Scale
SAS and Cloudera – Analytics at ScaleSAS and Cloudera – Analytics at Scale
SAS and Cloudera – Analytics at Scale
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
 
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
Sqoop on Spark for Data Ingestion-(Veena Basavaraj and Vinoth Chandar, Uber)
 
ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spar...
ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spar...ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spar...
ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spar...
 
A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0 A Reference Architecture for ETL 2.0
A Reference Architecture for ETL 2.0
 

Similar to Streaming Hadoop for Enterprise Adoption

Big Data and Implications on Platform Architecture
Big Data and Implications on Platform ArchitectureBig Data and Implications on Platform Architecture
Big Data and Implications on Platform ArchitectureOdinot Stanislas
 
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)Ajay Ohri
 
Data mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsData mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsGDi Techno Solutions
 
Analyzing Multi-Structured Data
Analyzing Multi-Structured DataAnalyzing Multi-Structured Data
Analyzing Multi-Structured DataDataWorks Summit
 
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendIntroducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendCaserta
 
Davis mark advanced search analytics in 20 minutes
Davis mark   advanced search analytics in 20 minutesDavis mark   advanced search analytics in 20 minutes
Davis mark advanced search analytics in 20 minutesLucidworks (Archived)
 
Don't be Hadooped when looking for Big Data ROI
Don't be Hadooped when looking for Big Data ROIDon't be Hadooped when looking for Big Data ROI
Don't be Hadooped when looking for Big Data ROIDataWorks Summit
 
Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案Etu Solution
 
2011 - TDWI Big Data Forum - The New Analytics
2011 - TDWI Big Data Forum - The New Analytics 2011 - TDWI Big Data Forum - The New Analytics
2011 - TDWI Big Data Forum - The New Analytics Casey Kiernan
 
Big Data Paris : Hadoop and NoSQL
Big Data Paris : Hadoop and NoSQLBig Data Paris : Hadoop and NoSQL
Big Data Paris : Hadoop and NoSQLTugdual Grall
 
New Analytical Architectures for Big Data
New Analytical Architectures for Big DataNew Analytical Architectures for Big Data
New Analytical Architectures for Big DataCasey Kiernan
 
Social Business in a World of Abundant Real-time Data
Social Business in a World of Abundant Real-time DataSocial Business in a World of Abundant Real-time Data
Social Business in a World of Abundant Real-time DataLee Bryant
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingm_hepburn
 
Farklı Ortamlarda Büyük Veri Kavramı -Big Data by Sybase
Farklı Ortamlarda Büyük Veri Kavramı -Big Data by Sybase Farklı Ortamlarda Büyük Veri Kavramı -Big Data by Sybase
Farklı Ortamlarda Büyük Veri Kavramı -Big Data by Sybase Sybase Türkiye
 

Similar to Streaming Hadoop for Enterprise Adoption (20)

Big Data and Implications on Platform Architecture
Big Data and Implications on Platform ArchitectureBig Data and Implications on Platform Architecture
Big Data and Implications on Platform Architecture
 
Kurukshetra - Big Data
Kurukshetra - Big DataKurukshetra - Big Data
Kurukshetra - Big Data
 
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
 
Data mining - GDi Techno Solutions
Data mining - GDi Techno SolutionsData mining - GDi Techno Solutions
Data mining - GDi Techno Solutions
 
Analyzing Multi-Structured Data
Analyzing Multi-Structured DataAnalyzing Multi-Structured Data
Analyzing Multi-Structured Data
 
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendIntroducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
 
Davis mark advanced search analytics in 20 minutes
Davis mark   advanced search analytics in 20 minutesDavis mark   advanced search analytics in 20 minutes
Davis mark advanced search analytics in 20 minutes
 
Introducing Splunk – The Big Data Engine
Introducing Splunk – The Big Data EngineIntroducing Splunk – The Big Data Engine
Introducing Splunk – The Big Data Engine
 
Data mining
Data miningData mining
Data mining
 
Don't be Hadooped when looking for Big Data ROI
Don't be Hadooped when looking for Big Data ROIDon't be Hadooped when looking for Big Data ROI
Don't be Hadooped when looking for Big Data ROI
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案
 
Galaxy of bits
Galaxy of bitsGalaxy of bits
Galaxy of bits
 
2011 - TDWI Big Data Forum - The New Analytics
2011 - TDWI Big Data Forum - The New Analytics 2011 - TDWI Big Data Forum - The New Analytics
2011 - TDWI Big Data Forum - The New Analytics
 
Big Data Paris : Hadoop and NoSQL
Big Data Paris : Hadoop and NoSQLBig Data Paris : Hadoop and NoSQL
Big Data Paris : Hadoop and NoSQL
 
New Analytical Architectures for Big Data
New Analytical Architectures for Big DataNew Analytical Architectures for Big Data
New Analytical Architectures for Big Data
 
Social Business in a World of Abundant Real-time Data
Social Business in a World of Abundant Real-time DataSocial Business in a World of Abundant Real-time Data
Social Business in a World of Abundant Real-time Data
 
Apache hadoop bigdata-in-banking
Apache hadoop bigdata-in-bankingApache hadoop bigdata-in-banking
Apache hadoop bigdata-in-banking
 
Farklı Ortamlarda Büyük Veri Kavramı -Big Data by Sybase
Farklı Ortamlarda Büyük Veri Kavramı -Big Data by Sybase Farklı Ortamlarda Büyük Veri Kavramı -Big Data by Sybase
Farklı Ortamlarda Büyük Veri Kavramı -Big Data by Sybase
 
Treasure Data and Heroku
Treasure Data and HerokuTreasure Data and Heroku
Treasure Data and Heroku
 

More from DATAVERSITY

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...DATAVERSITY
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceDATAVERSITY
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data LiteracyDATAVERSITY
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsDATAVERSITY
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for YouDATAVERSITY
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?DATAVERSITY
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?DATAVERSITY
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling FundamentalsDATAVERSITY
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectDATAVERSITY
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at ScaleDATAVERSITY
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?DATAVERSITY
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...DATAVERSITY
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsDATAVERSITY
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayDATAVERSITY
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best PracticesDATAVERSITY
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?DATAVERSITY
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best PracticesDATAVERSITY
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageDATAVERSITY
 

More from DATAVERSITY (20)

Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
 
Data at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and GovernanceData at the Speed of Business with Data Mastering and Governance
Data at the Speed of Business with Data Mastering and Governance
 
Exploring Levels of Data Literacy
Exploring Levels of Data LiteracyExploring Levels of Data Literacy
Exploring Levels of Data Literacy
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
 
Make Data Work for You
Make Data Work for YouMake Data Work for You
Make Data Work for You
 
Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?Data Catalogs Are the Answer – What is the Question?
Data Catalogs Are the Answer – What is the Question?
 
Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?Data Catalogs Are the Answer – What Is the Question?
Data Catalogs Are the Answer – What Is the Question?
 
Data Modeling Fundamentals
Data Modeling FundamentalsData Modeling Fundamentals
Data Modeling Fundamentals
 
Showing ROI for Your Analytic Project
Showing ROI for Your Analytic ProjectShowing ROI for Your Analytic Project
Showing ROI for Your Analytic Project
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
 
Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?Is Enterprise Data Literacy Possible?
Is Enterprise Data Literacy Possible?
 
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
The Data Trifecta – Privacy, Security & Governance Race from Reactivity to Re...
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Data Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and ForwardsData Governance Trends - A Look Backwards and Forwards
Data Governance Trends - A Look Backwards and Forwards
 
Data Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement TodayData Governance Trends and Best Practices To Implement Today
Data Governance Trends and Best Practices To Implement Today
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Data Strategy Best Practices
Data Strategy Best PracticesData Strategy Best Practices
Data Strategy Best Practices
 
Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?Who Should Own Data Governance – IT or Business?
Who Should Own Data Governance – IT or Business?
 
Data Management Best Practices
Data Management Best PracticesData Management Best Practices
Data Management Best Practices
 
MLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive AdvantageMLOps – Applying DevOps to Competitive Advantage
MLOps – Applying DevOps to Competitive Advantage
 

Recently uploaded

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Recently uploaded (20)

H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 

Streaming Hadoop for Enterprise Adoption

  • 1. the open source database you’ll never outgrow VoltDB Streamlining Hadoop for Enterprise Adoption September 2012 Mark Hydar Market Technology and Strategy
  • 2. Agenda  “Big Data” and the Data Landscape  Our Thoughts on Data Pipelines  VoltDB Streaming Overview  Addressing the Topics + Hadoop is too complex and expensive for mainstream enterprises. + It’s taking too long to find useful insights amid an ocean of low quality, disconnected data. + How can my organization reduce costs and mitigate data risks? + How can I gain quicker access to operational insights? + What can I do to improve data quality and reduce total pipeline processing times?  Q&A VoltDB 2
  • 3. What is “Big Data”? Velocity = Big Data = + Volume + Variety  The old equation of Big Data big data = volume = warehouse (OLAP)  This has changed big data = velocity + volume = transactions (OLTP) + warehouse (OLAP)  Big Data is fast and deep  As it arrives, you probably do something with it (or wish you could) you may just want to slow it down! VoltDB 3
  • 4. Data Value Chain Value of Aggregate Individual Data Value Data Item Data Value Age of Data Interactive Real time Analytics Record Lookup Historical Analytics Exploratory Analytics Milliseconds Hundredths of seconds Second(s) Minutes Hours . Place trade . Calculate risk . Retrieve click stream . Backtest algo . Algo discovery . Serve ad . Leaderboard . Show orders . Business Intelligence . Log analysis . Enrich stream . Aggregate . Daily reports . Fraud pattern match . Examine packet . Count . Approve trans. VoltDB 4
  • 5. Database Landscape Fast Value of Complex Individual Aggregate Large Data Item Data Value Application Complexity VoltDB Hadoop, etc. NoSQL Data Value Data Warehouse NewSQL Traditional RDBMS Simple Slow Small Transactional Analytic Interactive Real time Analytics Record Lookup Historical Analytics Exploratory Analytics VoltDB 5
  • 6. Lifecycle for Big Data trades clicks logins authorizations impressions sensors orders  Make the most informed decision every time there Real-time Decision Making is an interaction  Real-time decisions are Who: Automated informed by operational What: Transact, analytics & past Operational Analytics knowledge  Sometimes called OLTP Reports & Analytics Knowledge Who: Analyst “It is not enough to capture massive What: Reports, Drill down, … amounts of data; organizations must also sift through the data, extract information and transform it into Exploratory Analysis actionable knowledge.” Knowledge Who: Data Scientist What: Discover trends, rules, … VoltDB 6
  • 7. The Value Available in Fast Data  Data-driven decisions in real-time  Better decisions by using more information sources  Faster decisions  Insights into real-time operational analytics If I could “Access” to my data sooner, I would be more Insightful than the next guy. -What your competition is thinking right now VoltDB 7
  • 8. The Typical Hadoop Data Pipeline Complex, Hard Slow, Expensive, to Do and Lacks Data Aged Data Expensive Visibility STRUCTURED • Social Media • Marketing Campaigns • Customer Profiles • Account Transactions • Special Offers • Sensor Data Un-Structured Semi-Structured • Web Logs • JSON • Streaming Technologies Ingest Parse & Prepare Transform and Clean Extract and Load VoltDB 8
  • 9. The VoltDB Database • High-performance VoltDB Performance Advantage RDBMS TPC-C single node (Oracle) 45 X • In-memory database TPC-C single node (MySQL) 100 X • Automatic scale-out on commodity servers Cost Disruption Ex. Traditional • Built-in high availability RDBMS VoltDB SPARC 18 , 8-core Intel • Relational structures, System SuperCluster/Oracle 11g servers ACID and SQL Price/tpmC $ 1.01 $0.012 is Faster, Better, Cheaper than the competition VoltDB 9
  • 10. How VoltDB is Used  High throughput, relentless data feeds  Fast operations on high value data  Real-time analytics present immediate visibility Data Feed Real-time Real-time Operations Examine packet by Analytics Identify bandwidth Network Traffic Monitoring Network packets source / destination outliers Recall post trade Financial Trade Support Market orders Ingest trade data order groupings Identification and Notification and Sensor tracking & analytics Sensor position feed cleansing of tag info groupings Game state updates Mobile Gaming Online game Leaderboard lookups and usage patterns Report ad Digital Ad Tech Ad bid / click stream Bid, optimize content performance VoltDB 10
  • 11. Why Address the Streaming Gap  VoltDB Hadoop Data Streaming + Real Time Business Decisioning + Data Quality and Enrichment + Simplifies Data Integration + Shortens Time to Market  Increasing Productivity + Common Development and Data Environment — Provides Reusability (data flows and computations) — Provides Universal Access to Real Time Data — Uses well known data access utilities VoltDB 11
  • 12. Hadoop Integration  Motivation + Big Data = high velocity (VoltDB) + high volume (Hadoop) — VoltDB ingests fire hose, manages state, supports real-time analytics, spools to Hadoop — Hadoop imports from VoltDB (via Sqoop)  Technologies + VoltDB Export — Real-time streaming export — Data consolidation, aggregation, enrichment — Buffering and overflow to eliminate “impedance mismatches” — Bi-directional durability + Sqoop — Cloudera-authored DBMS=>HDFS importer — Pull-based technology VoltDB 12
  • 13. The Optimized Data Pipeline Repeatable Real Time Reduced Business Integration Model Business Infrastructure Analytics (Reduced Complexity) Intelligence (Predictability) (Warehousing) STRUCTURED • Social Media • Marketing Campaigns • Customer Profiles • Account Real Time Transactions Analytics • Special Offers • Sensor Data • SQL/Relational • ACID • Scalable Un-Structured • Data Enrichment Semi-Structured • Web Logs • JSON • Streaming Technologies Ingest Parse & Prepare Transform and Clean Extract and Load VoltDB 13
  • 14. VoltDB in the Big Data Landscape (Today) Online gaming Ad serving Sensor data SQL/Relational Financial ACID trade High Performance Internet Scalable commerce Highly Available SaaS, Benchmarked at Web 2.0 3+ million ACID TPS Mobile platforms VoltDB 14
  • 15. The Close  As Hadoop adoption increases, it has become evident that programming Hadoop is too complex and expensive for mainstream enterprises.  It’s taking too long to find useful insights amid an ocean of low quality, disconnected data. How do I address the key barriers to Hadoop adoption?  How can organizations reduce costs and mitigate data risks?  How can they gain quicker access to operational insights?  What can I do to improve data quality and reduce total pipeline processing times?  Did we explore strategies that leading organizations are using to streamline Hadoop processing and eliminate adoption challenges? VoltDB 15
  • 16. Questions? email mhydar@voltdb.com twitter @mhydar Download the VoltDB Enterprise Edition Trial http://voltdb.com/products-services/downloads Join the VoltDB Community http://community.voltdb.com More information on VoltDB Blog http://voltdb.com/company/blog Follow @VoltDB on twitter VoltDB 16