The document discusses building a big data analytics strategy in 3 main steps: 1) Gather requirements and objectives to determine a candidate strategy, 2) Select appropriate tools and technology to implement the strategy, and 3) Implement the strategy through operational readiness. It also covers key concepts like the 3V's model of big data, the big data analytics lifecycle, and strategy considerations at each phase like volume, variety and velocity of data. Example case studies of social media analytics on Hadoop are provided.
Building Your Big Data Analytics Strategy - Impetus Webinar
1. Building Your Big Data Analytics Strategy: Block by Block
@impetuscalling
Recorded version available at http://www.impetus.com/webinar_registration?event=archived&eid=53
2. Outline
Building a Big Data Strategy
Big Data & 3V’s
3V’s model
Big Data Analytics Lifecycle
Strategy Selection
Technology Selection
Hadoop Ecosystem
Alternative
Putting it Together
Case Studies and Applications
Q&A’s
Impetus Proprietary
3. Building a Big Data Strategy
Gather Requirements
What needs to be done?
Requirements
Objectives
Candidate Strategy Selection
Choose Candidate Strategy Options
Patterns & Best Practices
Tools & Technology Selection
Choose Tools and Technology
Implementation
Operational Readiness
4. Big Data & 3V’s Model
What is Big Data?
Define by size or volume
or by ‘breakdown’
3V’s model
Variety of Data
Volume of Data
Velocity of Data
5. Big Data Analytics Life Cycle
Creation Ingestion Analysis Visualization
6. Big Data Analytics Life Cycle: Concerns
Creation
• Storage
• Elasticity
• Monitoring
• Compression
Ingestion
• Integrations
• Tools & Technologies
• Standardization
Analysis
• Testing
• Pre Built Solutions
• Standardization
Visualization
• Tools & Technologies
• Channels
• In Memory Support
7. Big Data Analytics Life Cycle & 3V’s
Simple and potent tool to analyze strategy requirements
Answers simple questions of how much, what type and at what rate
Applicable to each phase
Using matrix to select suitable strategy
Dictates the potential choice of solutions, tools & technologies
8. Big Data Analytics Life Cycle & 3V’s
How Much? What Type? What Rate?
Creation
• Storage
• Elasticity
• Monitoring
• Compression
Ingestion
Analysis
Visualization
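The phase-by-question matrix above can be sketched as a small data structure. This is only an illustrative sketch, not the presenters' tooling: the mapping of Volume, Variety and Velocity to "How Much?", "What Type?" and "What Rate?" is a reading of the slides, the helper names `empty_matrix` and `unanswered` are hypothetical, and the sample answers are invented.

```python
from itertools import product

# Lifecycle phases and 3V questions, as named on the slides.
PHASES = ["Creation", "Ingestion", "Analysis", "Visualization"]
QUESTIONS = {"Volume": "How Much?", "Variety": "What Type?", "Velocity": "What Rate?"}


def empty_matrix():
    """Return a phase -> V -> answer mapping with every cell unanswered."""
    return {phase: {v: None for v in QUESTIONS} for phase in PHASES}


def unanswered(matrix):
    """List the (phase, V) cells that still need an answer."""
    return [(p, v) for p, v in product(PHASES, QUESTIONS) if matrix[p][v] is None]


matrix = empty_matrix()
# Hypothetical answers, for illustration only.
matrix["Creation"]["Volume"] = "about 2 TB per day"
matrix["Creation"]["Velocity"] = "continuous stream"

for phase, v in unanswered(matrix):
    print(f"{phase}: {QUESTIONS[v]} ({v}) is still open")
```

Filling every cell before choosing tools keeps the requirements for each phase explicit, which is the purpose the slides give for the matrix.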
9. Strategy Selection
10. Big Data Analytics Strategy
Creation Ingestion Analysis Visualization
11. Technology Selection
12. The Hadoop Ecosystem
13. Alternate/Emerging Options
Making stuff Faster
Pervasive DataRush, HStreaming
Cloud Map Reduce
HPCC, DataStax Brisk, Platform Computing
MARS, GPMR
Major MPPs: in-database MR (Oracle, Aster, etc.)
Hadapt
NoSQL
Cassandra, MongoDB, HBase
Riak
Redis
14. Alternate/Emerging Options
Graph Type DB’s
Neo4j
HyperGraphDB
InfiniteGraph
Pregel
Trinity
Faster SQL DB’s
VoltDB, Clustrix
Hardware + Software Solutions
Exadata, ParStream
Virtualized Options of Hardware + Software Solutions such as Xeround
15. Putting it Together
16. Indirect Analytics over Hadoop
17. Direct Analytics over Hadoop
18. Analytics over Hadoop with MPP DW
19. Case Studies
20. Social Media Analytics
Problem Statement
Analytics on huge data sets populated from live streaming data
Simplifying services, cost reduction, proactive analysis of customer feedback
Challenges
Live data streaming from social media websites
Clustering
Learn typical comments, demands, questions
Value: Helps identify response / behavior anomalies
Classification
Learn to identify known patterns automatically
Value: useful in filtering, pre-emptive addressing, gaining customer confidence
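The classification idea above (learn known patterns, then label new comments automatically) can be illustrated with a toy nearest-centroid classifier over bag-of-words vectors. This is a plain Python sketch chosen for brevity, not the Mahout-based approach the webinar describes: the labels, the sample comments and the similarity threshold are all invented for illustration, and a comment that matches no known pattern is returned as `None`, which could be treated as a candidate anomaly.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lowercase the text and count its whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def train(labelled):
    """Sum the token counts of all examples sharing a label (one centroid per label)."""
    centroids = {}
    for text, label in labelled:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids


def classify(text, centroids, threshold=0.2):
    """Return the most similar label, or None when nothing is similar enough."""
    vec = bag_of_words(text)
    label, score = max(((l, cosine(vec, c)) for l, c in centroids.items()), key=lambda x: x[1])
    return label if score >= threshold else None


# Hypothetical labelled comments, invented for illustration.
examples = [
    ("my bill is too high this month", "billing"),
    ("why was i charged twice on my bill", "billing"),
    ("the app keeps crashing on login", "outage"),
    ("service is down and the app will not load", "outage"),
]
centroids = train(examples)
print(classify("charged too much on my bill", centroids))
print(classify("great weather today", centroids))
```

At the data volumes the case study targets, the same idea would run as distributed jobs rather than in a single process; the sketch only shows the shape of the learn-then-label step.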
21. Social Media Analytics (cont.)
Approach
Prepare matrix to capture How Much?, What Type?, What Rate? against each phase
Use big data solution strategy covering all concerns of big data analytics lifecycle
Solution
Architected a flexible and scalable solution with near real time streaming of social media data on daily/hourly scheduled jobs
Built a solution based on Hadoop, HBase, Hive and Mahout
22. Solution Overview
Creation Ingestion Analysis Visualization
23. Summing Up
Strategy selection
Creating a matrix to build suitable strategy
Enables creation of a platform or a solution to manage 3Vs of data
Solutions, tools & technologies
Hadoop based Big Data Analytics is a scalable and cost effective option
24. About Us
Strategic partners for software product engineering and R&D
Thought leaders in cutting-edge technologies
Mature processes and practices that are methodical, yet flexible
Diverse domain expertise
Our services in Big Data and Analytics
Expert consulting
Proof-of-concept & Implementation
Support services
25. Questions
Please send in your questions using the chat panel
26. Thank you
For more information,
write to us at inquiry@impetus.com
@impetuscalling