Analytic Platforms in the Real World with 451Research and Calpont_July 2012
2. Today’s Presenters
Matt Aslett
• Research Manager, Data Management and Analytics
• With 451 Research since 2007
• www.twitter.com/maslett
Information Management
• Operational databases
• Data warehousing
• Data caching
• Event processing
Commercial Adoption of Open Source (CAOS)
• Open source projects
• Adoption of open source software
• Vendor strategies
InfiniDB® Scalable. Fast. Simple. 2 © 2012 Calpont. All Rights Reserved.
3. Today’s Presenters
Bob Wilkinson
• Calpont Vice President of Engineering
• Formerly CTO for Tektronix Communications
• 16 years of product development
• Responsible for design, development, and support of InfiniDB®
4. Today’s Discussion
• Matt Aslett
o Total Data and the Rise of the Analytic Platform
o Analytic Platforms in the Big Data ecosystem
o Defining the Analytic Platform
• Bob Wilkinson
o InfiniDB Analytic Platform
o InfiniDB in Action
• Telecommunications
• Online Advertising
• Summary and Q&A
5. Overview
The rise of the analytic platform
What and why
The analytic platform’s place in the ‘big data’ ecosystem
Where and when
The key characteristics of an analytic platform
How and which
© 2012 by The 451 Group. All rights reserved
7. Big Data – Implications for Data Management
“Big data” - realization of greater business intelligence by
storing, processing and analyzing data that was previously
ignored due to the limitations of traditional data management
technologies to handle its volume, velocity and/or variety.
Volume: The volume of data is too large for traditional database software tools to cope with.
Velocity: The data is being produced at a rate that is beyond the performance limits of traditional systems.
Variety: The data lacks the structure to make it suitable for storage and analysis in traditional databases and data warehouses.
8. Total Data - Beyond ‘Big Data’
The adoption of non-traditional data processing technologies is
driven not just by the nature of the data, but also by the user’s
particular data processing requirements.
Totality: The desire to process and analyze data in its entirety, rather than analyzing a sample of data and extrapolating the results.
Exploration: The interest in exploratory analytic approaches, in which schema is defined in response to the nature of the query.
Frequency: The desire to increase the rate of analysis in order to generate more accurate and timely business intelligence.
Dependency: The reliance on existing technologies and skills, and the need to balance investment in those existing technologies and skills with the adoption of new techniques.
9. Beyond the limitations of traditional data warehousing
The EDW is supposed to be a single source of the ‘truth’ and avoid
data silos.
One of the most significant inefficiencies of data warehousing is
that users have traditionally had to design their data-warehouse
models to match their planned queries.
This approach is too rigid in a world of rapidly changing business
requirements and real-time decision-making.
Its inflexibility encourages the growth of data silos and the very
redundancy and duplication issues the EDW was designed to avoid.
A business analyst or executive unable to get the answers to queries
they require from the EDW is likely to find their own ways to answer
these queries.
10. The Rise of Specialist Platforms
The alternative is to embrace dispersed data, adopting not silos but
specialist data platforms that complement the EDW.
‘Total Data’ describes an approach that treats the various data
management components as an integrated whole.
eBay is a prime example of this approach in action, with its
Singularity analytic platform, as well as an EDW and Hadoop.
[Figure labels: Structured SQL analysis; Semi-structured SQL; Unstructured analysis]
11. Defining “Analytic Platform”
Enterprises have used specialist data marts/warehouses for many
years for departmental/application-specific use-cases.
Analytic platforms are designed to enable different analytic
approaches that complement traditional EDW workloads.
Large data volumes
Raw/close-to-raw data
Multiple dimensions
Complex variables
Near real-time requirements
Columnar storage
SQL, user-defined functions
MapReduce
In-database analytics
Flexible schema
12. Flexible schema
Apply structural patterns as the data is analyzed, rather than when
it is loaded into the database.
[Diagram: two pipelines, each driven by a Query]
Schema on write: Application → Schema → Data storage → Results
Schema on read: Application → Data storage → Schema → Results
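The contrast between the two pipelines above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event records, the field names (`user`, `bytes`, `app`, `cell`) and the helper functions are all hypothetical and do not come from the presentation or from any product.

```python
import json

# Hypothetical raw events; the field names are invented for illustration.
RAW_EVENTS = [
    '{"user": "a", "bytes": 120, "app": "video"}',
    '{"user": "b", "bytes": 80}',
    '{"user": "a", "bytes": 40, "app": "mail", "cell": "X1"}',
]

# Schema on write: the columns are fixed before any data is loaded.
WRITE_SCHEMA = ("user", "bytes")


def load_schema_on_write(lines):
    """Apply the schema at load time: only the declared columns survive."""
    table = []
    for line in lines:
        record = json.loads(line)
        table.append(tuple(record[col] for col in WRITE_SCHEMA))
    return table


def load_schema_on_read(lines):
    """Store the raw lines untouched; no structure is imposed yet."""
    return list(lines)


def query_bytes_by(field, raw_store):
    """Apply structure at query time: group bytes by any field in the raw data."""
    totals = {}
    for line in raw_store:
        record = json.loads(line)
        key = record.get(field, "unknown")
        totals[key] = totals.get(key, 0) + record["bytes"]
    return totals


if __name__ == "__main__":
    print(load_schema_on_write(RAW_EVENTS))
    print(query_bytes_by("app", load_schema_on_read(RAW_EVENTS)))
```

In the schema-on-write table the `app` field is gone, so a later question about usage per application cannot be answered without reloading. The schema-on-read store can answer it because the structure is chosen when the query runs, at the cost of parsing the raw data on every query.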
13. “Exploratory Analytic Platform”
The need for EAPs is not necessarily driven by the choice of storage
platform (e.g., Hadoop or analytic database) or query language
(e.g., SQL or MapReduce).
Instead it is driven by the nature of the query or workload, or the
skills and tools employed by the person interacting with the data.
While data analysts are analyzing data to find answers to existing
questions, data scientists are exploring patterns in data to prompt
new questions.
E.g. customer analysis, interactive marketing, targeted advertising,
churn analysis, sentiment analysis, fraud analysis.
An EAP should be flexible enough to enable the use of multiple
techniques to support exploratory analysis.
14. EAP in larger Total Data landscape
EDW retains core role for stable schema and structured SQL analytics on ERP, CRM apps etc.
Hadoop for storage and processing of raw data, analysis of unstructured, schemaless data.
EAP for flexible, exploratory analytics on rapidly updated data with evolving schema.
15. The Spectrum of Analytic Approaches
Integration enables a ‘total data’ approach that treats the various
platforms as points on a spectrum depending on the rigidity and
importance of schema, rather than individual silos.
17. The Spectrum of Analytic Approaches
Calpont InfiniDB
• Columnar MPP
• Vertical and horizontal range partitioning
• Integrated MapReduce
• Distributed user-defined functions
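As a rough illustration of what columnar storage combined with horizontal range partitioning buys a query, the sketch below stores each slice of rows column by column and skips slices whose value range cannot match the filter. It is a generic teaching example, not InfiniDB's implementation; the `Partition` class, the function names and the sample data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """One horizontal slice of rows, stored column by column (vertical split)."""
    columns: dict = field(default_factory=dict)  # column name -> list of values

    def min_max(self, name):
        # A real system would keep this as precomputed metadata.
        values = self.columns[name]
        return min(values), max(values)


def build_partitions(rows, column_names, rows_per_partition):
    """Split rows into fixed-size horizontal partitions of per-column lists."""
    partitions = []
    for start in range(0, len(rows), rows_per_partition):
        chunk = rows[start:start + rows_per_partition]
        columns = {name: [row[i] for row in chunk]
                   for i, name in enumerate(column_names)}
        partitions.append(Partition(columns))
    return partitions


def sum_where_between(partitions, filter_col, low, high, value_col):
    """Sum value_col over rows where low <= filter_col <= high.

    Returns (total, partitions_scanned) so the effect of skipping is visible.
    """
    total, scanned = 0, 0
    for part in partitions:
        part_min, part_max = part.min_max(filter_col)
        if part_max < low or part_min > high:
            continue  # the whole partition lies outside the range: skip it
        scanned += 1
        # Only the two columns named by the query are read.
        for key, value in zip(part.columns[filter_col], part.columns[value_col]):
            if low <= key <= high:
                total += value
    return total, scanned


if __name__ == "__main__":
    # Hypothetical usage rows: (day, cell, bytes), ordered by day.
    rows = [(day, "cell%d" % (day % 3), day * 10) for day in range(1, 13)]
    parts = build_partitions(rows, ("day", "cell", "bytes"), rows_per_partition=4)
    print(sum_where_between(parts, "day", 6, 7, "bytes"))
```

With twelve rows in three partitions, a filter on days 6 to 7 touches one partition and never reads the `cell` column, which is the basic reason columnar layouts suit wide tables queried a few columns at a time.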
18. Considerations for Deploying an Analytic Platform
Scalability – the ability to handle large volumes of data and expand
as data volumes grow
Performance – high performance processing is required to deliver
rapid results
Efficiency – in-database analytics approaches that take the query to
the data
Flexibility – no reliance on restrictive schema to deliver the desired
performance
Variability – support for multiple query approaches and advanced
functions to enable exploratory analysis
19. Calpont Corporation
Calpont Mission
To provide a highly scalable data platform that enables analytic business decisions as timely as customers and markets dictate.
• Software Company
• High Perf/ HA Analytic Data Platform
• Dallas HQ, Silicon Valley
• Partners in North America, Europe, Japan
• Online Media, Digital Networks, Telco
20. What is InfiniDB?
Simple, Powerful Platform for Big Data Analytics
Columnar Performance Efficiency
Widely used MySQL Interface
MPP, MapReduce style Query Execution
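The phrase "MapReduce style query execution" can be read as: each worker aggregates its own slice of the data, and the partial results are then merged. The sketch below shows that pattern with Python's standard library. It is a minimal illustration under assumptions, not a description of how InfiniDB executes queries; the function names and the sample partitions are hypothetical, and threads stand in for separate nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_partial_totals(partition):
    """Map step: a worker aggregates only its own slice of (app, bytes) rows."""
    partial = Counter()
    for app, nbytes in partition:
        partial[app] += nbytes
    return partial


def reduce_totals(partials):
    """Reduce step: merge the per-worker partial aggregates."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total


def parallel_bytes_by_app(partitions, workers=2):
    """Run the map step on every partition in parallel, then reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_partial_totals, partitions))
    return dict(reduce_totals(partials))


if __name__ == "__main__":
    # Hypothetical data, already split across two workers.
    partitions = [
        [("video", 100), ("mail", 10)],
        [("video", 50), ("web", 5)],
    ]
    print(parallel_bytes_by_app(partitions))
```

Because each partial aggregate is much smaller than the rows it summarizes, only a little data has to move between workers, which is what lets this style of execution scale out as partitions are added.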
21. Benefits of InfiniDB
Real-time, Consistent Query Performance
Linear Scale for Massive Data
Removes Limits to Dimensions and Granularity
Easy to Deploy and Maintain
22. InfiniDB Analytic Platform – DW and Exploration
[Diagram. Columns: Analytic Needs, Analytic Platform, Data Integration, Big Data Sources. Labels: Data Warehouse, ETL, Transactional, Dimensional Analytics, Hadoop, Operational, Analytic Data Store, MDM, Data Discovery, Legacy, Direct Load, Model, RDBMS, Predictive Analytics]
24. Telecommunications Market Challenges
[Chart: Global Mobile Voice and Data Revenues/ARPU – 2007-2013. Series: Voice Revenue, Data Revenue, Total ARPU. Axis: US $ Millions per Year. Source: Informa Telecoms & Media]
Macro Drivers:
• Subscriber Growth declining
• ARPU declining
• Revenue Growth vs. Cost to Carry
Do carriers?
• Attempt to control costs via throttling, etc.
• Increase revenue through monetization strategies
7/18/2012
25. The Telco Gold Mine
Telco data is rich – Can it be fully leveraged?
Quality
• Meets CSP expectations?
• Meets Subscriber expectations?
Data Sources
• Element feeds
• Probe feeds
• Device agents
• Log files
• Care data
Usage
• What applications/services?
• How much, how long, etc.
Location
• Where are they?
• Movement patterns, etc.
26. Challenge? or Opportunity?
Multi-Dimensional Analysis
Dimensions: service, application, network, customer, kpi
Linkage?
27. Telco Success
Representative data from Customer Experience Management (CEM) analytics:
Metric                    Legacy        InfiniDB      Improvement
# of DRs                  15 billion    15 billion    n/a
Database size             4 TB          < 1 TB        (75%)
Load rates                30k/sec       > 120k/sec    400%
Typical analytics query   300 sec.      5 sec.        (98%)
Benefits
Game-changer for storage of and access to non-aggregated data
Near linear scale out performance
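The Improvement column of the telco table can be reproduced with simple arithmetic. The sketch below is one reading of those figures, offered as an assumption: the parenthesized values are taken as percentage reductions, the load-rate value as the new rate expressed as a percentage of the legacy rate, and the bounds "< 1 TB" and "> 120k/sec" are evaluated at their boundary values. The function names are hypothetical.

```python
def reduction_pct(legacy, new):
    """Percentage by which a smaller-is-better metric shrank."""
    return (legacy - new) / legacy * 100


def ratio_pct(legacy, new):
    """New value as a percentage of the legacy value (larger is better)."""
    return new / legacy * 100


if __name__ == "__main__":
    # Database size: 4 TB down to (at most) 1 TB.
    print(reduction_pct(4, 1))
    # Load rate: 30k/sec up to (at least) 120k/sec.
    print(ratio_pct(30_000, 120_000))
    # Typical analytics query: 300 sec down to 5 sec.
    print(round(reduction_pct(300, 5), 1))
```

Under this reading a 400% load rate means four times the legacy rate, which is a 300% increase. The query-time reduction works out to roughly 98.3%, consistent with the (98%) shown.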
29. Online Advertising – Market Challenges
• Advertising Analytics (≠ Web Analytics)
o Interactions and performance of ads on other sites
o Attribution analysis - ad optimization, efficient targeting,
and return on ad spend
• Challenges
o Massive daily data consumption – “Billions Served”
o Ad targeting is not real-time with traditional data tech
o Attribution analytics effectiveness
Wide Dimensionality Granularity
30. Mobile Advertising – Analytic Data Environment
[Diagram: Info Sources / Source Data (Location Ads, Free WiFi Ad Share, WiFi Captive Display, App Embedded Ads) → ETL → Analytic Platform → BI / Analytic Front End]
Special Needs
• Latitudinal / Longitudinal Geospatial Functions
• Military Grid Ref System (MGRS) Functions
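To make the latitude/longitude requirement concrete, here is a minimal sketch of one common geospatial function: great-circle distance via the haversine formula, used to select ads near a device. It is an illustrative assumption about the kind of function meant, not something shown in the presentation. The coordinates, ad names and helper names are hypothetical, and MGRS conversion is not attempted.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ads_within(ads, lat, lon, radius_km):
    """Names of ads whose location lies within radius_km of (lat, lon)."""
    return [name for name, ad_lat, ad_lon in ads
            if haversine_km(lat, lon, ad_lat, ad_lon) <= radius_km]


if __name__ == "__main__":
    # Hypothetical ad locations: (name, latitude, longitude).
    ads = [
        ("cafe", 32.78, -96.80),
        ("mall", 32.90, -96.80),
        ("far", 37.40, -122.10),
    ]
    print(ads_within(ads, 32.78, -96.80, radius_km=5))
```

A function like this is evaluated once per row, so running it inside the database next to the location data avoids exporting every record to an external tool.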
Non-Calpont product names are trademarks of their respective owners
31. Online Advertising Success
Location-based Mobile Advertiser Funnels Big Data Insights
Metric                    Legacy                  InfiniDB       Improvement
# of DRs                  300 Million             300 Million    n/a
Database size             > 6 TB                  3 TB           (50%)
Load rates                100k/sec                1M+/sec        1000%
Typical analytics query   20-30 min with cubes    15 sec.        (99.2%)
Benefits
Mobile Audience Insights Report
Real-time analytics about niche segments
Simple MySQL interface for easy use of Hadoop ETL extracts
“Mobile Audience Insights” for segment affinity and engagement strategies
32. Key Takeaways
A spectrum of analytic platforms addresses structured and
unstructured needs and complements the traditional EDW
Proper choice of an analytics platform should depend on rigidity
and importance of schema, as well as skills and tools of users
InfiniDB is a scalable MPP columnar platform supporting
exploratory analytics for structured data
Calpont is helping partners create transformational solutions in
Telco Customer Experience and Online Advertising
33. More Info on 451 Research and Calpont
Matt Aslett
451 Research
www.451research.com
@maslett @451research
451 examines trends behind Big Data and the Total Data management approach
Bob Wilkinson
Calpont Corporation
www.calpont.com
@Calpont, @InfiniDB
Calpont discusses why Big Data in online marketing needs modern data technology