Calpont InfiniDB®
Accelerating Data Insights




Where the Rubber Meets the Road –
Analytic Platforms in the Real World

Featuring Matt Aslett, 451 Research

July 18, 2012
Today’s Presenters

    Matt Aslett
    • Research Manager,
      Data Management and Analytics
    • With 451 Research since 2007
    • www.twitter.com/maslett

      Information Management
       • Operational databases
       • Data warehousing
       • Data caching
       • Event processing

      Commercial Adoption of Open Source (CAOS)
       • Open source projects
       • Adoption of open source software
       • Vendor strategies


InfiniDB® Scalable. Fast. Simple.   2                        © 2012 Calpont. All Rights Reserved.
Today’s Presenters

    Bob Wilkinson
    • Calpont Vice President of Engineering
    • Formerly CTO for Tektronix
      Communications
    • 16 years of product development
    • Responsible for design, development,
      and support of InfiniDB®




Today’s Discussion

 • Matt Aslett
         o Total Data and the Rise of the Analytic Platform
         o Analytic Platforms in the Big Data ecosystem
         o Defining the Analytic Platform
 • Bob Wilkinson
         o InfiniDB Analytic Platform
         o InfiniDB in Action
            • Telecommunications
            • Online Advertising
 • Summary and Q&A

Overview

 • The rise of the analytic platform
         o What and why

 • The analytic platform’s place in the ‘big data’ ecosystem
         o Where and when

 • The key characteristics of an analytic platform
         o How and which







                      © 2012 by The 451 Group. All rights reserved
The 451 Group




Big Data – Implications for Data Management
 • “Big data” - realization of greater business intelligence by
   storing, processing and analyzing data that was previously
   ignored due to the limitations of traditional data management
   technologies to handle its volume, velocity and/or variety.

 • Volume: The volume of data is too large for traditional database
   software tools to cope with.
 • Velocity: The data is being produced at a rate that is beyond the
   performance limits of traditional systems.
 • Variety: The data lacks the structure to make it suitable for storage
   and analysis in traditional databases and data warehouses.




Total Data - Beyond ‘Big Data’
 • The adoption of non-traditional data processing technologies is
   driven not just by the nature of the data, but also by the user’s
   particular data processing requirements.

 • Totality: The desire to process and analyze data in its entirety,
   rather than analyzing a sample of data and extrapolating the results.
 • Exploration: The interest in exploratory analytic approaches, in which
   schema is defined in response to the nature of the query.
 • Frequency: The desire to increase the rate of analysis in order to
   generate more accurate and timely business intelligence.
 • Dependency: The reliance on existing technologies and skills, and the
   need to balance investment in those existing technologies and skills
   with the adoption of new techniques.



Beyond the limitations of traditional data warehousing
 • The EDW is supposed to be a single source of the ‘truth’ and avoid
   data silos.

 • One of the most significant inefficiencies of data warehousing is
   that users have traditionally had to design their data-warehouse
   models to match their planned queries.

 • This approach is too rigid in a world of rapidly changing business
   requirements and real-time decision-making.

 • And its inflexibility serves to encourage the growth of data silos and
   the exact redundancy and duplication issues the EDW was
   apparently designed to avoid.

 • A business analyst or executive unable to get the answers to queries
   they require from the EDW is likely to find their own ways to answer
   these queries.



The Rise of Specialist Platforms

 • The alternative is to embrace dispersed data, adopting not silos but
   specialist data platforms that complement the EDW.

 • ‘Total Data’ describes an approach that treats the various data
   management components as an integrated whole.

 • eBay is a prime example of this approach in action, with its
   Singularity analytic platform, as well as an EDW and Hadoop.




Structured SQL analysis    Semi-structured SQL                            Unstructured analysis

Defining “Analytic Platform”
 • Enterprises have used specialist data marts/warehouses for many
   years for departmental/application-specific use-cases.

 • Analytic platforms are designed to enable different analytic
   approaches that complement traditional EDW workloads.

         o Large data volumes
         o Raw/close-to-raw data
         o Multiple dimensions
         o Complex variables
         o Near real-time requirements
         o Columnar storage
         o SQL, user-defined functions
         o MapReduce
         o In-database analytics
         o Flexible schema



Flexible schema
 • Apply structural patterns as the data is analyzed, rather than when
   it is loaded into the database.

Schema on write (Query):  Application -> Schema -> Data storage -> Results
Schema on read (Query):   Application -> Data storage -> Schema -> Results
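The difference between the two flows can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw event records, the field names (`user`, `bytes`, `app`, `cell`) and both helper functions are hypothetical and do not come from any product discussed in this deck. With schema on write, records that do not fit the fixed schema are rejected at load time; with schema on read, the raw records are kept and a structure is applied only when a query asks for it.

```python
import json

# Hypothetical raw event records, kept exactly as they arrived.
RAW_EVENTS = [
    '{"user": "a", "bytes": 120, "app": "video"}',
    '{"user": "b", "bytes": 45}',
    '{"user": "a", "bytes": 300, "app": "mail", "cell": "X1"}',
]


def load_schema_on_write(lines, columns):
    """Shape each record at load time; drop records that do not fit the schema."""
    table = []
    for line in lines:
        record = json.loads(line)
        if set(record) != set(columns):
            continue  # does not match the fixed schema, so it never reaches storage
        table.append(tuple(record[c] for c in columns))
    return table


def query_schema_on_read(lines, columns, default=None):
    """Leave storage untouched; project the requested columns at query time."""
    for line in lines:
        record = json.loads(line)
        yield tuple(record.get(c, default) for c in columns)


# Schema on write: only the first record matches (user, bytes, app).
written = load_schema_on_write(RAW_EVENTS, ["user", "bytes", "app"])

# Schema on read: two different questions over the same raw data.
read_usage = list(query_schema_on_read(RAW_EVENTS, ["user", "bytes"]))
read_cells = list(query_schema_on_read(RAW_EVENTS, ["user", "cell"]))
```

The second query uses a field (`cell`) that the fixed schema never anticipated, which is the kind of exploratory question the slide has in mind.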




“Exploratory Analytic Platform”
 • The need for EAPs is not necessarily driven by the choice of storage
   platform (e.g., Hadoop or analytic database) or query language
   (e.g., SQL or MapReduce).

 • Instead it is driven by the nature of the query or workload, or the
   skills and tools employed by the person interacting with the data.

 • While data analysts are analyzing data to find answers to existing
   questions, data scientists are exploring patterns in data to prompt
   new questions.

 • E.g. customer analysis, interactive marketing, targeted advertising,
   churn analysis, sentiment analysis, fraud analysis.

 • An EAP should be flexible enough to enable the use of multiple
   techniques to support exploratory analysis.


EAP in larger Total Data landscape
 • EDW retains core role for stable schema and structured SQL analytics
   on ERP, CRM apps etc.

 • Hadoop for storage and processing of raw data, analysis of
   unstructured, schemaless data.

 • EAP for flexible, exploratory analytics on rapidly updated data with
   evolving schema.


The Spectrum of Analytic Approaches

 • Integration enables a ‘total data’ approach that treats the various
   platforms as points on a spectrum depending on the rigidity and
   importance of schema, rather than individual silos.




              Calpont InfiniDB
              • Columnar MPP
              • Vertical and horizontal range partitioning
              • Integrated MapReduce
              • Distributed user-defined functions
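As a rough illustration of what vertical and horizontal range partitioning buy a query, the sketch below stores each column separately (vertical) and splits the rows into ranges that remember their minimum and maximum key (horizontal). It is not InfiniDB's implementation; the table contents, the partition size and the function names are assumptions made up for this example.

```python
# Hypothetical table: each column is stored on its own (vertical partitioning).
COLUMNS = {
    "day":   [1, 1, 2, 3, 5, 6, 8, 9],
    "bytes": [10, 20, 30, 40, 50, 60, 70, 80],
    "user":  ["a", "b", "a", "c", "b", "a", "c", "b"],
}

ROWS_PER_PARTITION = 4  # assumed size, chosen only to keep the example small


def build_partitions(key_values, rows_per_partition=ROWS_PER_PARTITION):
    """Split rows into horizontal ranges and record each range's min/max key."""
    partitions = []
    for start in range(0, len(key_values), rows_per_partition):
        chunk = key_values[start:start + rows_per_partition]
        partitions.append({
            "rows": range(start, start + len(chunk)),
            "min": min(chunk),
            "max": max(chunk),
        })
    return partitions


def sum_bytes_between(lo, hi):
    """Sum `bytes` where lo <= day <= hi; return (total, partitions scanned)."""
    days, nbytes = COLUMNS["day"], COLUMNS["bytes"]  # the `user` column is never read
    total, scanned = 0, 0
    for part in build_partitions(days):
        if part["max"] < lo or part["min"] > hi:
            continue  # the whole range lies outside the predicate, so skip it
        scanned += 1
        total += sum(nbytes[i] for i in part["rows"] if lo <= days[i] <= hi)
    return total, scanned


late_total, late_scanned = sum_bytes_between(5, 9)
```

Here the query for days 5 through 9 reads two of the three columns and one of the two row ranges, which is the effect the two kinds of partitioning are meant to achieve.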

Considerations for Deploying an Analytic Platform
 • Scalability – the ability to handle large volumes of data and expand
   as data volumes grow

 • Performance – high performance processing is required to deliver
   rapid results

 • Efficiency – in-database analytics approaches that take the query to
   the data

 • Flexibility – no reliance on restrictive schema to deliver the desired
   performance

 • Variability – support for multiple query approaches and advanced
   functions to enable exploratory analysis



Calpont Corporation
Calpont Mission
To provide a highly scalable data platform that enables analytic business
decisions as timely as customers and markets dictate.

• Software Company

• High Perf/HA Analytic Data Platform

• Dallas HQ, Silicon Valley

• Partners in North America, Europe, Japan

• Online Media, Digital Networks, Telco
What is InfiniDB?



                     Simple, Powerful Platform for Big Data Analytics


               Columnar Performance Efficiency
                      Widely used MySQL Interface
      MPP, MapReduce style Query Execution
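A minimal sketch of what "MPP, MapReduce style" execution means for a simple aggregate, roughly `SELECT app, SUM(bytes) ... GROUP BY app`. The data slices, the worker threads standing in for nodes, and the function names are all assumptions for illustration; this is not a description of InfiniDB internals.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical data slices, one per worker node; each row is (app, bytes).
NODE_SLICES = [
    [("video", 100), ("mail", 5), ("video", 50)],
    [("mail", 10), ("web", 20)],
    [("video", 25), ("web", 30), ("web", 5)],
]


def partial_sum(rows):
    """Map step: a worker aggregates only the rows in its local slice."""
    totals = Counter()
    for app, nbytes in rows:
        totals[app] += nbytes
    return totals


def merge(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def bytes_per_app(slices):
    """Run the map step on every slice in parallel, then reduce the partials."""
    # Threads stand in for separate nodes in this sketch.
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(partial_sum, slices))
    return dict(reduce(merge, partials, Counter()))


totals = bytes_per_app(NODE_SLICES)
```

Because each worker returns a small partial result rather than its raw rows, adding slices adds workers without growing the amount of data that has to be merged at the end.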

Benefits of InfiniDB



                     Real-time, Consistent Query Performance

                     Linear Scale for Massive Data

                     Removes Limits to Dimensions and Granularity

                     Easy to Deploy and Maintain


InfiniDB Analytic Platform – DW and Exploration
[Diagram, left to right]
Analytic Needs: Dimensional Analytics, Data Discovery, Predictive Analytics
Analytic Platform: Data Warehouse, Analytic Data Store
Data Integration: ETL, Hadoop, MDM, Direct Load Model
Big Data Sources: Transactional, Operational, Legacy RDBMS
InfiniDB - Telecommunications
Telecommunications Market Challenges
[Chart: Global Mobile Voice and Data Revenues/ARPU, 2007-2013. Y-axis: US $ Millions per Year. Series: Voice Revenue, Data Revenue, Total ARPU. Source: Informa Telecoms & Media]

Macro Drivers:
• Subscriber growth declining
• ARPU declining
• Revenue growth vs. cost to carry

Do carriers:
• Attempt to control costs via throttling, etc.?
• Increase revenue through monetization strategies?


The Telco Gold Mine
Telco data is rich – Can it be fully leveraged?

Quality
•    Meets CSP expectations?
•    Meets Subscriber expectations?

Usage
•    What applications/services?
•    How much, how long, etc.

Location
•    Where are they?
•    Movement patterns, etc.

Data Sources
• Element feeds
• Probe feeds
• Device agents
• Log files
• Care data


Challenge? or Opportunity?
  Multi-Dimensional Analysis

   Dimensions: service, application, network, customer, kpi
   Linkage?




Telco Success
         Representative data from Customer Experience (CEM) analytics:

                                    Legacy       InfiniDB      Improvement
         # of DRs                   15 billion   15 billion    n/a
         Database size              4 TB         < 1 TB        (75%)
         Load rates                 30K/sec      > 120K/sec    400%
         Typical analytics query    300 sec.     5 sec.        (98%)
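The Improvement column mixes two kinds of figures, so a quick check of the arithmetic may help when reading it. The sketch below is our reading of the table, not something stated on the slide: the parenthesized values look like percentage reductions, while the load-rate figure looks like the new rate expressed as a percentage of the old one (a 4x ratio). The helper names are hypothetical, and "< 1 TB" is taken at its upper bound of 1 TB.

```python
def ratio(legacy, new):
    """How many times larger the new value is than the legacy value."""
    return new / legacy


def reduction_pct(legacy, new):
    """Percentage by which the new value is smaller than the legacy value."""
    return (legacy - new) / legacy * 100


load_ratio = ratio(30_000, 120_000)        # load rate in records/sec
size_reduction = reduction_pct(4.0, 1.0)   # database size in TB, using 1 TB as the bound
query_reduction = reduction_pct(300, 5)    # typical query time in seconds
```

Under that reading, the load rate is 4 times the legacy rate (400% of it), the database is at least 75% smaller, and the query time drops by a little over 98%.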


       Benefits
        • Game-changer for storage of and access to non-aggregated data
        • Near-linear scale-out performance




InfiniDB - Online Advertising
Online Advertising – Market Challenges

  • Advertising Analytics (≠ Web Analytics)
          o Interactions and performance of ads on other sites
          o Attribution analysis - ad optimization, efficient targeting,
            and return on ad spend
  • Challenges
          o Massive daily data consumption – “Billions Served”
          o Ad targeting is not real-time with traditional data tech
          o Attribution analytics effectiveness




               Wide Dimensionality                    Granularity

Mobile Advertising – Analytic Data Environment
Info Sources: Location Ads, Free WiFi Ad Share, WiFi Captive Display, App Embedded Ads
Data flow: Source Data -> ETL -> Analytic Platform -> BI / Analytic Front End

Special Needs
• Latitudinal / Longitudinal Geospatial Functions
• Military Grid Ref System (MGRS) Functions
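To make the latitude/longitude requirement concrete, here is a minimal sketch of one common geospatial function: great-circle distance by the haversine formula, used to select ad impressions near a point. The impression records, coordinates and function names are hypothetical, the Earth is treated as a sphere, and MGRS conversion is not covered. It is an illustration of the kind of function involved, not the functions this customer actually used.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def impressions_near(impressions, lat, lon, radius_km):
    """Keep only the impressions recorded within radius_km of the given point."""
    return [
        imp for imp in impressions
        if haversine_km(imp["lat"], imp["lon"], lat, lon) <= radius_km
    ]


# Hypothetical impressions: one in Dallas, one in San Francisco.
IMPRESSIONS = [
    {"ad": "coffee", "lat": 32.7767, "lon": -96.7970},
    {"ad": "transit", "lat": 37.7749, "lon": -122.4194},
]

nearby = impressions_near(IMPRESSIONS, 32.78, -96.80, radius_km=50)
```

A production system would normally push such a filter into the database as a user-defined function rather than pull rows out to apply it.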


                                            Non-Calpont product names are trademarks of their respective owners

Online Advertising Success

         Location-based Mobile Advertiser Funnels Big Data Insights

                                    Legacy                  InfiniDB      Improvement
         # of DRs                   300 million             300 million   n/a
         Database size              > 6 TB                  3 TB          (50%)
         Load rates                 100K/sec                1M+/sec       1000%
         Typical analytics query    20-30 min with cubes    15 sec.       (99.2%)

       Benefits
        • Real-time analytics about niche segments
        • Simple MySQL interface for easy use of Hadoop ETL extracts
        • “Mobile Audience Insights” for segment affinity and
          engagement strategies

       [Figure: Mobile Audience Insights Report]

Key Takeaways
  • A spectrum of analytic platforms addresses structured and
    unstructured needs that complement the traditional EDW
  • Proper choice of an analytics platform should depend on rigidity
    and importance of schema, as well as skills and tools of users
  • InfiniDB is a scalable MPP columnar platform supporting
    exploratory analytics for structured data
  • Calpont is helping partners create transformational solutions in
    Telco Customer Experience and Online Advertising




More Info on 451 Research and Calpont

   Matt Aslett
   451 Research
   www.451research.com
   @maslett @451research
   451 examines trends behind Big Data and the Total Data management approach

   Bob Wilkinson
   Calpont Corporation
   www.calpont.com
   @Calpont, @InfiniDB
   Calpont discusses why Big Data in online marketing needs modern data technology

InfiniDB® Scalable. Fast. Simple.                33                                © 2012 Calpont. All Rights Reserved.
®

Más contenido relacionado

La actualidad más candente

IOUG93 - Technical Architecture for the Data Warehouse - Presentation
IOUG93 - Technical Architecture for the Data Warehouse - PresentationIOUG93 - Technical Architecture for the Data Warehouse - Presentation
IOUG93 - Technical Architecture for the Data Warehouse - Presentation
David Walker
 
Big Data and Data Virtualization
Big Data and Data VirtualizationBig Data and Data Virtualization
Big Data and Data Virtualization
Kenneth Peeples
 
2012 10 bigdata_overview
2012 10 bigdata_overview2012 10 bigdata_overview
2012 10 bigdata_overview
jdijcks
 
DataStax & 451 Group Webinar - Real NoSQL Applications in the Enterprise Today
DataStax & 451 Group Webinar - Real NoSQL Applications in the Enterprise TodayDataStax & 451 Group Webinar - Real NoSQL Applications in the Enterprise Today
DataStax & 451 Group Webinar - Real NoSQL Applications in the Enterprise Today
DataStax
 
Rajesh Angadi Brochure
Rajesh Angadi Brochure Rajesh Angadi Brochure
Rajesh Angadi Brochure
Rajesh Angadi
 

La actualidad más candente (20)

Webinar - Security and Manageability: Key Criteria in Selecting Enterprise-Gr...
Webinar - Security and Manageability: Key Criteria in Selecting Enterprise-Gr...Webinar - Security and Manageability: Key Criteria in Selecting Enterprise-Gr...
Webinar - Security and Manageability: Key Criteria in Selecting Enterprise-Gr...
 
IOUG93 - Technical Architecture for the Data Warehouse - Presentation
IOUG93 - Technical Architecture for the Data Warehouse - PresentationIOUG93 - Technical Architecture for the Data Warehouse - Presentation
IOUG93 - Technical Architecture for the Data Warehouse - Presentation
 
Big Data SE vs. SE for Big Data
Big Data SE vs. SE for Big DataBig Data SE vs. SE for Big Data
Big Data SE vs. SE for Big Data
 
Big Data and Data Virtualization
Big Data and Data VirtualizationBig Data and Data Virtualization
Big Data and Data Virtualization
 
Ten Pillars of World Class Data Virtualization
Ten Pillars of World Class Data VirtualizationTen Pillars of World Class Data Virtualization
Ten Pillars of World Class Data Virtualization
 
2012 10 bigdata_overview
2012 10 bigdata_overview2012 10 bigdata_overview
2012 10 bigdata_overview
 
Big data and apache hadoop adoption
Big data and apache hadoop adoptionBig data and apache hadoop adoption
Big data and apache hadoop adoption
 
Slides: Relational to NoSQL Migration
Slides: Relational to NoSQL MigrationSlides: Relational to NoSQL Migration
Slides: Relational to NoSQL Migration
 
DataStax & 451 Group Webinar - Real NoSQL Applications in the Enterprise Today
DataStax & 451 Group Webinar - Real NoSQL Applications in the Enterprise TodayDataStax & 451 Group Webinar - Real NoSQL Applications in the Enterprise Today
DataStax & 451 Group Webinar - Real NoSQL Applications in the Enterprise Today
 
The Convergence of Data & Digital: Mapping Out a Cohesive Strategy for Maximu...
The Convergence of Data & Digital: Mapping Out a Cohesive Strategy for Maximu...The Convergence of Data & Digital: Mapping Out a Cohesive Strategy for Maximu...
The Convergence of Data & Digital: Mapping Out a Cohesive Strategy for Maximu...
 
Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?Are You Killing the Benefits of Your Data Lake?
Are You Killing the Benefits of Your Data Lake?
 
Data architecture for modern enterprise
Data architecture for modern enterpriseData architecture for modern enterprise
Data architecture for modern enterprise
 
Enabling Cloud Data Integration (EMEA)
Enabling Cloud Data Integration (EMEA)Enabling Cloud Data Integration (EMEA)
Enabling Cloud Data Integration (EMEA)
 
Rajesh Angadi Brochure
Rajesh Angadi Brochure Rajesh Angadi Brochure
Rajesh Angadi Brochure
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
Modern data warehouse
Modern data warehouseModern data warehouse
Modern data warehouse
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
 
The Emerging Data Lake IT Strategy
The Emerging Data Lake IT StrategyThe Emerging Data Lake IT Strategy
The Emerging Data Lake IT Strategy
 
Developing a Strategy for Data Lake Governance
Developing a Strategy for Data Lake GovernanceDeveloping a Strategy for Data Lake Governance
Developing a Strategy for Data Lake Governance
 
Data lakes
Data lakesData lakes
Data lakes
 

Similar a Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Ibm big data ibm marriage of hadoop and data warehousing
Ibm big dataibm marriage of hadoop and data warehousingIbm big dataibm marriage of hadoop and data warehousing
Ibm big data ibm marriage of hadoop and data warehousing
DataWorks Summit
 
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
BigMine
 
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ajay Ohri
 

Similar a Analytic Platforms in the Real World with 451Research and Calpont_July 2012 (20)

The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
The Business Advantage of Hadoop: Lessons from the Field – Cloudera Summer We...
 
What is the Point of Hadoop
What is the Point of HadoopWhat is the Point of Hadoop
What is the Point of Hadoop
 
Analyze This! Best Practices For Big And Fast Data
Analyze This! Best Practices For Big And Fast DataAnalyze This! Best Practices For Big And Fast Data
Analyze This! Best Practices For Big And Fast Data
 
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & TalendIntroducing the Big Data Ecosystem with Caserta Concepts & Talend
Introducing the Big Data Ecosystem with Caserta Concepts & Talend
 
Anexinet Big Data Solutions
Anexinet Big Data SolutionsAnexinet Big Data Solutions
Anexinet Big Data Solutions
 
Investigative Analytics- What's in a Data Scientists Toolbox
Investigative Analytics- What's in a Data Scientists ToolboxInvestigative Analytics- What's in a Data Scientists Toolbox
Investigative Analytics- What's in a Data Scientists Toolbox
 
Derfor skal du bruge en DataLake
Derfor skal du bruge en DataLakeDerfor skal du bruge en DataLake
Derfor skal du bruge en DataLake
 
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
Simplifying Your Cloud Architecture with a Logical Data Fabric (APAC)
 
All Together Now: A Recipe for Successful Data Governance
All Together Now: A Recipe for Successful Data GovernanceAll Together Now: A Recipe for Successful Data Governance
All Together Now: A Recipe for Successful Data Governance
 
Data lake benefits
Data lake benefitsData lake benefits
Data lake benefits
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal Modernization
 
Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)Advanced Analytics and Machine Learning with Data Virtualization (India)
Advanced Analytics and Machine Learning with Data Virtualization (India)
 
Ibm big data ibm marriage of hadoop and data warehousing
Ibm big dataibm marriage of hadoop and data warehousingIbm big dataibm marriage of hadoop and data warehousing
Ibm big data ibm marriage of hadoop and data warehousing
 
Data lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiryData lake-itweekend-sharif university-vahid amiry
Data lake-itweekend-sharif university-vahid amiry
 
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
 
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenariosThe Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
The Practice of Big Data - The Hadoop ecosystem explained with usage scenarios
 
A beginners guide to Cloudera Hadoop
A beginners guide to Cloudera HadoopA beginners guide to Cloudera Hadoop
A beginners guide to Cloudera Hadoop
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
 
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)Ibm big data    hadoop summit 2012 james kobielus final 6-13-12(1)
Ibm big data hadoop summit 2012 james kobielus final 6-13-12(1)
 
Simplifying Big Data Analytics for the Business
Simplifying Big Data Analytics for the BusinessSimplifying Big Data Analytics for the Business
Simplifying Big Data Analytics for the Business
 

Analytic Platforms in the Real World with 451Research and Calpont_July 2012

  • 1. Calpont InfiniDB®: Accelerating Data Insights
      Where the Rubber Meets the Road – Analytic Platforms in the Real World
      Featuring Matt Aslett, 451Research
      July 18, 2012
  • 2. Today’s Presenters: Matt Aslett
      • Research Manager, Data Management and Analytics
      • With 451 Research since 2007
      • www.twitter.com/maslett
      Information Management
      • Operational databases
      • Data warehousing
      • Data caching
      • Event processing
      Commercial Adoption of Open Source (CAOS)
      • Open source projects
      • Adoption of open source software
      • Vendor strategies
      InfiniDB® Scalable. Fast. Simple. © 2012 Calpont. All Rights Reserved.
  • 3. Today’s Presenters: Bob Wilkinson
      • Calpont Vice President of Engineering
      • Formerly CTO for Tektronix Communications
      • 16 years of product development
      • Responsible for design, development, and support of InfiniDB®
  • 4. Today’s Discussion
      • Matt Aslett
          o Total Data and the Rise of the Analytic Platform
          o Analytic Platforms in the Big Data ecosystem
          o Defining the Analytic Platform
      • Bob Wilkinson
          o InfiniDB Analytic Platform
          o InfiniDB in Action
              • Telecommunications
              • Online Advertising
      • Summary and Q&A
  • 5. Overview
      • The rise of the analytic platform: what and why
      • The analytic platform’s place in the ‘big data’ ecosystem: where and when
      • The key characteristics of an analytic platform: how and which
      © 2012 by The 451 Group. All rights reserved
  • 6. The 451 Group
  • 7. Big Data – Implications for Data Management
      “Big data”: realization of greater business intelligence by storing, processing and analyzing data that was previously ignored due to the limitations of traditional data management technologies to handle its volume, velocity and/or variety.
      • Volume: The volume of data is too large for traditional database software tools to cope with
      • Velocity: The data is being produced at a rate that is beyond the performance limits of traditional systems
      • Variety: The data lacks the structure to make it suitable for storage and analysis in traditional databases and data warehouses
  • 8. Total Data - Beyond ‘Big Data’
      The adoption of non-traditional data processing technologies is driven not just by the nature of the data, but also by the user’s particular data processing requirements.
      • Totality: The desire to process and analyze data in its entirety, rather than analyzing a sample of data and extrapolating the results.
      • Exploration: The interest in exploratory analytic approaches, in which schema is defined in response to the nature of the query.
      • Frequency: The desire to increase the rate of analysis in order to generate more accurate and timely business intelligence.
      • Dependency: The reliance on existing technologies and skills, and the need to balance investment in those existing technologies and skills with the adoption of new techniques.
  • 9. Beyond the limitations of traditional data warehousing
      • The EDW is supposed to be a single source of the ‘truth’ and avoid data silos.
      • One of the most significant inefficiencies of data warehousing is that users have traditionally had to design their data-warehouse models to match their planned queries.
      • This approach is too rigid in a world of rapidly changing business requirements and real-time decision-making.
      • And its inflexibility serves to encourage the growth of data silos and the exact redundancy and duplication issues the EDW was apparently designed to avoid.
      • A business analyst or executive unable to get the answers to queries they require from the EDW is likely to find their own ways to answer these queries.
  • 10. The Rise of Specialist Platforms
      • The alternative is to embrace dispersed data, adopting not silos but specialist data platforms, that complement the EDW.
      • ‘Total Data’ describes an approach that treats the various data management components as an integrated whole.
      • eBay is a prime example of this approach in action, with its Singularity analytic platform, as well as an EDW and Hadoop.
      [Diagram labels: Structured SQL analysis; Semi-structured SQL; Unstructured analysis]
  • 11. Defining “Analytic Platform”
      • Enterprises have used specialist data marts/warehouses for many years for departmental/application-specific use-cases.
      • Analytic platforms are designed to enable different analytic approaches, that complement traditional EDW workloads.
      Workload characteristics:
      • Large data volumes
      • Raw/close-to-raw data
      • Multiple dimensions
      • Complex variables
      • Near real-time requirements
      Platform features:
      • Columnar storage
      • SQL, user-defined functions
      • MapReduce
      • In-database analytics
      • Flexible schema
  • 12. Flexible schema
      • Apply structural patterns as the data is analyzed, rather than when it is loaded into the database.
      [Diagram, schema on write: Application, Schema, Data storage; Query, Results]
      [Diagram, schema on read: Application, Data storage, Schema; Query, Results]
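The schema-on-write versus schema-on-read distinction on the Flexible schema slide can be illustrated with a minimal Python sketch. It is not taken from the presentation and does not represent InfiniDB or any other product: the sample events, the field names, and the functions `load_schema_on_write`, `load_schema_on_read`, and `query_on_read` are all hypothetical, chosen only to show where the structure gets applied.

```python
import json

# Hypothetical raw events; the field names are illustrative assumptions.
RAW_EVENTS = [
    '{"user": "a", "bytes": 120, "app": "video"}',
    '{"user": "b", "bytes": 300}',
    '{"user": "a", "bytes": 80, "app": "mail", "cell": "x1"}',
]

# Columns fixed in advance for the schema-on-write path (an assumption).
SCHEMA = ("user", "bytes")


def load_schema_on_write(lines, schema=SCHEMA):
    """Apply the schema at load time: only predeclared columns are stored."""
    table = []
    for line in lines:
        record = json.loads(line)
        table.append(tuple(record.get(col) for col in schema))
    return table


def load_schema_on_read(lines):
    """Store the records unmodified; no structure is imposed at load time."""
    return list(lines)


def query_on_read(store, fields):
    """Apply structure at query time by projecting the requested fields."""
    for line in store:
        record = json.loads(line)
        yield tuple(record.get(field) for field in fields)


# Schema on write: the "app" field was never loaded, so it cannot be queried.
written = load_schema_on_write(RAW_EVENTS)

# Schema on read: a later question about "app" needs no reload of the data.
stored = load_schema_on_read(RAW_EVENTS)
by_app = list(query_on_read(stored, ("user", "app")))
```

In this sketch the schema-on-write table has dropped the `app` field at load time, while the schema-on-read store can still answer a query about it, with `None` standing in where a record lacks the field. The trade-off is that the read path parses every record on every query.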
  • 13. “Exploratory Analytic Platform”
      • The need for EAPs is not necessarily driven by the choice of storage platform (e.g., Hadoop or analytic database) or query language (e.g., SQL or MapReduce).
      • Instead it is driven by the nature of the query or workload, or the skills and tools employed by the person interacting with the data.
      • While data analysts are analyzing data to find answers to existing questions, data scientists are exploring patterns in data to prompt new questions.
      • E.g. customer analysis, interactive marketing, targeted advertising, churn analysis, sentiment analysis, fraud analysis.
      • An EAP should be flexible enough to enable the use of multiple techniques to support exploratory analysis.
  • 14. EAP in larger Total Data landscape
      • EDW retains core role for stable schema and structured SQL analytics on ERP, CRM apps etc.
      • Hadoop for storage and processing of raw data, analysis of unstructured, schemaless data.
      • EAP for flexible, exploratory analytics on rapidly updated data with evolving schema.
  • 15. The Spectrum of Analytic Approaches
      • Integration enables a ‘total data’ approach that treats the various platforms as points on a spectrum depending on the rigidity and importance of schema, rather than individual silos.
  • 16. The Spectrum of Analytic Approaches
      • Integration enables a ‘total data’ approach that treats the various platforms as points on a spectrum depending on the rigidity and importance of schema, rather than individual silos.
  • 17. The Spectrum of Analytic Approaches
      • Integration enables a ‘total data’ approach that treats the various platforms as points on a spectrum depending on the rigidity and importance of schema, rather than individual silos.
      Calpont InfiniDB
      • Columnar MPP
      • Vertical and horizontal range partitioning
      • Integrated MapReduce
      • Distributed user-defined functions
  • 18. Considerations for Deploying an Analytic Platform
      • Scalability – the ability to handle large volumes of data and expand as data volumes grow
      • Performance – high performance processing is required to deliver rapid results
      • Efficiency – in-database analytics approaches that take the query to the data
      • Flexibility – no reliance on restrictive schema to deliver the desired performance
      • Variability – support for multiple query approaches and advanced functions to enable exploratory analysis
  • 19. Calpont Corporation
      Calpont Mission: To provide a highly scalable data platform that enables analytic business decisions as timely as customers and markets dictate.
      • Software Company
      • High Perf/HA Analytic Data Platform
      • Dallas HQ, Silicon Valley
      • Partners in North America, Europe, Japan
      • Online Media, Digital Networks, Telco
  • 20. What is InfiniDB?
      Simple, Powerful Platform for Big Data Analytics
      • Columnar Performance Efficiency
      • Widely used MySQL Interface
      • MPP, MapReduce style Query Execution
  • 21. Benefits of InfiniDB
      • Real-time, Consistent Query Performance
      • Linear Scale for Massive Data
      • Removes Limits to Dimensions and Granularity
      • Easy to Deploy and Maintain
  • 22. InfiniDB Analytic Platform – DW and Exploration
      [Diagram columns: Analytic Needs, Analytic Platform, Data Integration, Big Data Sources]
      [Diagram labels, in extraction order: Data Warehouse ETL Transactional Dimensional Analytics Hadoop Operational Analytic Data MDM Data Discovery Store Legacy Direct Load Model RDBMS Predictive Analytics]
  • 24. Telecommunications Market Challenges
      [Chart: Global Mobile Voice and Data Revenues/ARPU – 2007-2013; US $ Millions per Year; series: Voice Revenue, Data Revenue, Total ARPU. Source: Informa Telecoms & Media]
      Macro Drivers:
      • Subscriber Growth declining
      • ARPU declining
      • Revenue Growth vs. Cost to Carry
      Do carriers?
      • Attempt to control costs via throttling, etc.
      • Increase revenue through monetization strategies
  • 25. The Telco Gold Mine
      Telco data is rich – Can it be fully leveraged?
      Quality
      • Meets CSP expectations?
      • Meets Subscriber expectations?
      Data Sources
      • Element feeds
      • Probe feeds
      • Device agents
      • Log files
      • Care data
      Usage
      • What applications/services?
      • How much, how long, etc.
      Location
      • Where are they?
      • Movement patterns, etc.
  • 26. Challenge? or Opportunity?
      Multi-Dimensional Analysis
      Dimensions: service, application, network, customer, kpi
      Linkage?
  • 27. Telco Success
      Representative data from Customer Experience (CEM) analytics:
      Metric | Legacy | InfiniDB | Improvement
      # of DRs | 15 billion | 15 billion | n/a
      Database size | 4 TB | < 1 TB | (75%)
      Load rates | 30k/sec | >120K/sec | 400%
      Typical analytics query | 300 sec. | 5 sec. | (98%)
      Benefits
      • Game-changer for storage of and access to non-aggregated data
      • Near linear scale out performance
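The Improvement column on the Telco Success slide can be checked with a short Python sketch. This is one possible reading of the slide, not a method stated in the presentation: it assumes that the parenthesized figures are percentage reductions and that the load-rate figure is the new rate expressed as a percentage of the legacy rate. The function names are hypothetical. Because the slide gives "< 1 TB" and ">120K/sec" as bounds, the values computed from 1 TB and 120,000/sec are lower bounds on the improvement.

```python
def percent_reduction(before, after):
    """Reduction from before to after, as a percentage of before."""
    return (before - after) / before * 100.0


def percent_of_baseline(before, after):
    """The after value expressed as a percentage of the before value."""
    return after / before * 100.0


# Typical analytics query: 300 sec. on the legacy system, 5 sec. on InfiniDB.
query_reduction = percent_reduction(300, 5)

# Database size: 4 TB legacy, under 1 TB on InfiniDB (1 TB used as the bound).
size_reduction = percent_reduction(4, 1)

# Load rates: 30k/sec legacy, over 120K/sec on InfiniDB (120,000 as the bound).
load_ratio = percent_of_baseline(30_000, 120_000)

print(f"query time reduction: {query_reduction:.1f}%")
print(f"database size reduction: at least {size_reduction:.0f}%")
print(f"load rate: at least {load_ratio:.0f}% of the legacy rate")
```

Under these assumptions the query-time and database-size figures agree with the slide's 98% and 75%. The 400% load-rate figure corresponds to a fourfold rate, which is a 300% increase over the legacy rate.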
  • 28. InfiniDB - Online Advertising
  • 29. Online Advertising – Market Challenges
      • Advertising Analytics (≠ Web Analytics)
          o Interactions and performance of ads on other sites
          o Attribution analysis - ad optimization, efficient targeting, and return on ad spend
      • Challenges
          o Massive daily data consumption – “Billions Served”
          o Ad targeting is not real-time with traditional data tech
          o Attribution analytics effectiveness
      [Diagram labels: Wide Dimensionality; Granularity]
  • 30. Mobile Advertising – Analytic Data Environment
      Info Sources
      • Location Ads
      • Free WiFi Ad Share
      • WiFi Captive Display
      • App Embedded Ads
      [Diagram stages: Source Data, ETL, Analytic Platform, BI / Analytic Front End]
      Special Needs
      • Latitudinal / Longitudinal Geospatial Functions
      • Military Grid Ref System (MGRS) Functions
      Non-Calpont product names are trademarks of their respective owners
  • 31. Online Advertising Success
      Location-based Mobile Advertiser Funnels Big Data Insights
      Metric | Legacy | InfiniDB | Improvement
      # of DRs | 300 Million | 300 Million | n/a
      Database size | >6 TB | 3 TB | (50%)
      Load rates | 100k/sec | 1M+/sec | 1000%
      Typical analytics query | 20-30 min with cubes | 15 sec. | (99.2%)
      Benefits
      • Real-time analytics about niche segments
      • Simple MySQL interface for easy use of Hadoop ETL extracts
      • “Mobile Audience Insights” for segment affinity and engagement strategies
      [Image: Mobile Audience Insights Report]
  • 32. Key Takeaways
      • A spectrum of analytic platforms addresses structured and unstructured needs that complement the traditional EDW
      • Proper choice of an analytics platform should depend on rigidity and importance of schema, as well as skills and tools of users
      • InfiniDB is a scalable MPP columnar platform supporting exploratory analytics for structured data
      • Calpont is helping partners create transformational solutions in Telco Customer Experience and Online Advertising
  • 33. More Info on 451 Research and Calpont
      Matt Aslett, 451 Research
      • www.451research.com
      • @maslett @451research
      • 451 examines trends behind Big Data and the Total Data management approach
      Bob Wilkinson, Calpont Corporation
      • www.calpont.com
      • @Calpont, @InfiniDB
      • Calpont discusses why Big Data in online marketing needs modern data technology