Big Data is Big and it is easy to get lost. If you are interested in a primer on what it is all about and how you can get started on the analytics, this deck will help you scratch the surface.
Big Data and Analytics - Why Should We Care?
1. Big Data & Analytics – So What?
A few answers by Vishwa Kolla
(Prepared for UMass Boston MBA Students)
2. About Vishwa Kolla
Vishwa Kolla
Sr. Consultant, Advanced Analytics & Modeling
Deloitte Consulting, Boston
MBA Carnegie Mellon University
MS University of Denver
BS BITS Pilani, India
Professional Interests
• Absolutely love solving a variety of business problems using advanced, predictive analytical techniques and building decision support systems as a means
• My engagements typically involve synthesizing Big Data into actionable insights
• Some engagements include:
  - Helping F5 firm solve customer attrition
  - Helping Top 5 professional services firm solve employee attrition
  - Predicting what viewers will watch and when on TV for a large Cable company
  - Building demand forecast models
  - Implementing scoring engines & building simulators
Personal Interests
• Most recent interest - watching my 4 year old grow (lot of fun and lot of work)
• Volunteering for a non-profit organization to help it grow and shape the direction of its growth
• Outdoor activities – climbing 14ers (peaks over 14,000 ft. high), skiing
• Traveling
• Meeting new people
• Philosophy – understanding differences between cultures and reasons why various cultures developed and are as they are currently
• Coaching / Mentoring / Teaching / Helping people reach their highest potential
Big Data & Analytics - Why Should We Care? Vishwa Kolla | vish.kolla@gmail.com March 27, 2013 2
3. Contents
What is Big Data?
Why is Big Data Important?
How does Big Data manifest in our daily lives?
Who is into Big Data?
What skills are required to master Big Data?
How can I get started?
4. What is Big Data?
Big data is high-volume, high-velocity and high-variety information assets
that demand cost-effective, innovative forms of information processing for
enhanced insight and decision making.
- Gartner
Source(s): (1) Gartner
5. What is Big Data?
Big Data Elements: Volume, Velocity, Variety
Volume of data created worldwide
• Dawn of time to 2003: 5 EB
• 2012: 2.7 ZB
• 2015: 10 ZB (estimated)
• Units: 1 GB = 10^9 Bytes, 1 TB = 10^12 Bytes, 1 PB = 10^15 Bytes, 1 EB = 10^18 Bytes, 1 ZB = 10^21 Bytes, 1 YB = 10^24 Bytes
Variety of data
• Radio, TV, News, E-Mails, Facebook Posts
• Tweets, Blogs, Photos, Videos (user and paid), RSS feeds
• Wikipedia, GPS data, RFID, POS Scanners, …
Velocity of data
• Walmart handles 1M transactions per hour
• Google processes 24 PB of data per day
• AT&T transfers 30 PB of data per day
• 90 trillion emails are sent per year
• World of Warcraft uses 1.3 PB of storage
• Facebook, when it had a user base of 900M users, had 25 PB of compressed data
• 400M tweets per day in June ’12
• 72 hours of video are uploaded to YouTube every minute
Source(s): (1) IBM’s Understanding Big Data eBook (2) Intel’s Big Data 101, (3) The Big Data Group (4) YouTube Press statistics
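As a rough worked example of the scale figures on this slide, the sketch below converts between the decimal byte units listed above and turns two of the velocity figures into per-second rates. The helper names (convert, per_second) are illustrative assumptions and not part of the deck; the input numbers are the ones quoted on the slide.

```python
# Decimal byte units as listed on the slide (1 GB = 10^9 bytes, and so on).
UNITS = {"GB": 10**9, "TB": 10**12, "PB": 10**15,
         "EB": 10**18, "ZB": 10**21, "YB": 10**24}

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def convert(value, from_unit, to_unit):
    """Convert a quantity between two of the units above (hypothetical helper)."""
    return value * UNITS[from_unit] / UNITS[to_unit]


def per_second(amount, period_seconds):
    """Average rate per second for an amount spread evenly over a period."""
    return amount / period_seconds


if __name__ == "__main__":
    # Ratio of the 2012 figure (2.7 ZB) to the pre-2003 figure (5 EB).
    ratio = convert(2.7, "ZB", "EB") / 5
    print(f"2012 figure is about {ratio:.0f}x the pre-2003 figure")

    # Walmart: 1M transactions per hour, expressed per second.
    walmart = per_second(1_000_000, SECONDS_PER_HOUR)
    print(f"Walmart: about {walmart:.0f} transactions per second")

    # Google: 24 PB per day, expressed in GB per second.
    google = per_second(convert(24, "PB", "GB"), SECONDS_PER_DAY)
    print(f"Google: about {google:.0f} GB per second")
```

Expressing the headline numbers per second makes them easier to compare with one another, since the slide quotes them per hour, per day and per year.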
6. How Big is Big, Really?
Source(s): (1) Mozy.com
7. Big Data & Analytics Ecosystem – It revolves around improving people’s lives
5 People
• Improving people’s lives is almost always the end goal
• The uses of big data and analytics transcend industries, firms and functions
4 Apps & Devices
• Desktop / Web / Mobile apps consume these insights
• E.g., Desktop -> Dashboards, Web -> Movie recommendations, Mobile -> Restaurant recommendations
3 Visualization & Analytics
• Visualization tools are used to better understand inherent patterns
• The data is processed, transformed and analyzed to create insights
• More often than not, scoring models are built that auto-generate insights
2 Data Store (Structured & Unstructured)
• The format of the data is either
  - Structured (e.g., database tables)
  - Unstructured (e.g., E-Mails, Blogs, Photos, Videos)
1 Data Providers (Instrumented & Non-instrumented)
• Data is generated from a wide variety of sources that are either
  - Instrumented (e.g., POS scanners, Video surveillance cameras)
  - Non-instrumented (e.g., Facebook posts, Twitter feeds, blogs)
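To make the idea of a scoring model that auto-generates insights a little more concrete, here is a minimal sketch of a logistic score for customer attrition. Everything in it is an assumption for illustration: the feature names, the weights and the intercept are hypothetical and would normally be fitted to historical data; the deck does not specify any particular model.

```python
import math

# Hypothetical weights; in practice these would be estimated from past data.
WEIGHTS = {
    "months_since_last_purchase": 0.30,
    "support_tickets": 0.45,
    "tenure_years": -0.25,
}
INTERCEPT = -1.5


def attrition_score(customer):
    """Return a score between 0 and 1 using a logistic function of the features."""
    z = INTERCEPT + sum(weight * customer[name] for name, weight in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def insight(customer, threshold=0.5):
    """Turn the raw score into a short, readable statement (the 'insight')."""
    score = attrition_score(customer)
    label = "at risk" if score >= threshold else "stable"
    return f"{customer['id']}: {label} (score {score:.2f})"


if __name__ == "__main__":
    customers = [
        {"id": "A", "months_since_last_purchase": 6, "support_tickets": 3, "tenure_years": 1},
        {"id": "B", "months_since_last_purchase": 0, "support_tickets": 0, "tenure_years": 5},
    ]
    for customer in customers:
        print(insight(customer))
```

The point of the sketch is the flow from stored data to a score to a statement an app or dashboard can display, not the specific weights.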
8. Contents
What is Big Data?
Why is Big Data Important?
How does Big Data manifest in our daily lives?
Who is into Big Data?
What skills are required to master Big Data?
How can I get started?
9. It is not about having a lot of data; it is about USING data effectively
Value gap as perceived by the market. Effective use of big data, amongst other things, is an important driver of this gap
Source(s): Google finance
10. It is not really about Big Data, but is really about Tiny Data (i.e., INSIGHTS)
• Who should I hire?
• What is similar to this customer?
• Given weather patterns, what should I sell?
• Who is likely to attrite?
• What will demand be in 2014?
• Who is likely to respond to an offer?
• Which ad will this customer watch?
• How much should I spend on marketing?
• What is at the risk of default?
• What should I offer?
• What does this customer value?
• Who is likely to vote for the democrats?
• Who will this customer watch?
• How much stock should I carry?
11. Contents
What is Big Data?
Why is Big Data Important?
How does Big Data manifest in our daily lives?
Who is into Big Data?
What skills are required to master Big Data?
How can I get started?
12. Then and Now – Marketing
Then Now
Marketing Leads Campaign Recommendations
Source(s): (1) Big Data Trends by David Feinleib
13. Then and Now – Selling
Then: One size fits all
Now: Personalization & Targeted Selling
Source(s): (1) Big Data Trends by David Feinleib
14. Then and Now – IT
Then: Peruse through log files
Now: Interactive Dashboards
Source(s): (1) Big Data Trends by David Feinleib
15. Then and Now – Customer Service
Then: Reactive Customer Service
Now: Pro-active Customer Service
Source(s): (1) Big Data Trends by David Feinleib
16. Then and Now – Credibility
Then: Credit Databases
Now: Professional & Social Networks
Source(s): (1) Big Data Trends by David Feinleib
17. Then and Now – Operations
Then: Maps
Now: Location Based Services
Source(s): (1) Big Data Trends by David Feinleib
18. Then and Now – Medical Research
Then: Keyword searches
Now: Word Clouds
Source(s): (1) Big Data Trends by David Feinleib
19. Then and Now – Fitness
Then: Manual tracking
Now: Focus on the goal
Source(s): (1) Big Data Trends by David Feinleib
20. Contents
What is Big Data?
Why is Big Data Important?
How does Big Data manifest in our daily lives?
Who is into Big Data?
What skills are required to master Big Data?
How can I get started?
21. The Big Data buzz has begun; everyone is into it …
WSJ
• Teaming up on Big Data
• Re-inventing society in the wake of Big Data
• Wanted – A few good data scientists
• Big Data adds nickels and dimes to Giant Wind Farm
• Visa uses Big Data in Fraud detection
• How Big Data is changing the Whole Equation of Business
• Moneyball, VC Style (using Big Data)
• Big Data, Big Blunders
• The New Shape of Big Data
• What your CEO is reading – Steam Engines Meet Big Data
Books / Articles
• IBM’s E-Book
• Deloitte E-Book
• HBR – The management revolution
• HBR – Making Advanced Analytics work for you
• HBR – Next best offer
• Amazon books
Big Data in Various Industries
• Healthcare
• Financial Services
• Big Data in Insurance
• Retail
A few company sites about Big Data
• Deloitte’s Big Data site
• PWC’s Big Data site
• IBM’s Big Data site
• Intel’s Big Data site
• Microsoft’s Big Data site
• Walmart
Big Data in Various Functions
• Marketing
• Operations
• HR
• Finance
22. … and they are into it very seriously
23. Contents
What is Big Data?
Why is Big Data Important?
How does Big Data manifest in our daily lives?
Who is into Big Data?
What skills are required to master Big Data?
How can I get started?
24. Skills Required to Master Big Data
5 People
• Leadership
• Management
• Administrative
• Consulting
• People
4 Apps & Devices
• Web 2.0
• Mobile Apps
• Device specific - iOS / Android
• Device agnostic – HTML 5.0
3 Visualization & Analytics
• Effective Data visualization techniques
• Statistical & Probabilistic techniques
• Analytical methods, tools & processes
2 Data Store (Structured & Unstructured)
• Cloud
• RDBMS (SQL)
• NoSQL, Hadoop
1 Data Providers (Instrumented & Non-instrumented)
• Hardware engineering
• Instrumentation & Design
• Content generators (FB posts, blogs, videos, photos)
25. Skills Required to Master Big Data & Analytics
Application areas (across Industries and Functions)
• Customer Analytics: Profitable growth opportunities, Next best offer, Cross-Sell
• Marketing Analytics: Pricing, Price & demand optimization, Market Mix
• Lifestyle & Life Stage: Insurance Premium Pricing, Detecting diseases based on lifestyle
• Fraud Analytics: Fraudulent claims, Fraudulent transactions
• Workforce Analytics: Hiring, Growing, Retaining
• Subscription Analytics: Credit Score, Analytics in the cloud
Underlying skills
• Statistical & Probabilistic Techniques
• Visualization Techniques
• Programming & Troubleshooting
• Genuine Curiosity
26. Skills Required to Master Big Data & Analytics – Some Tools to Learn
Source(s): http://www.bigdatalandscape.com/
27. Skills Required to Master Big Data – Example 1 of effective visualization
28. Skills Required to Master Big Data – Example 2 of effective visualization
Source(s): Visual News
29. Contents
What is Big Data?
Why is Big Data Important?
How does Big Data manifest in our daily lives?
Who is into Big Data?
What skills are required to master Big Data?
How can I get started?
30. Navigating Big Data and Analytics is a Journey
Master of Big Data & Analytics
1. Develop your
Establish eminence (by publishing your work)
Grow
1. Solve the same problem across industries
2. Solve different problems across industries
3. Apply methods across functions
Learn from the experts
1. Learn industry best practices when you get hired into a firm
2. Surround yourself with good people and experts to accelerate your learning
3. Build / implement models under the guidance of an expert
Foundation (School)
1. Pay attention in Probability & Statistics courses
2. Learn at least one programming language thoroughly and a few if you can
3. Recommended minimum tool sets: R, SAS, Tableau
4. Take advanced level analytical courses such as New Product Introduction, Optimizations, Operations Research, Data-mining, Modeling, Forecasting & Time Series, Simulations
5. Practice solving problems end-to-end to understand the implication of building models and implementing them in real life
31. Some things to watch out for
1. Big Data is not a panacea
2. Big Data is not everything for everybody
3. Big Data does not have all the answers and is directional at best, even if done right
4. Big Data & Analytics do not replace human intelligence; relying solely on Data & Analytics usually trips one up
5. There are several limitations of using Big Data & Analytics. Some are:
a) Data collection limitation -> Not all data can be, or is, collected. One may have access to a ton of data, but very little of it can be analyzed and/or is meaningful
b) Data quality limitation -> Garbage in, garbage out; data quality is getting better every day
c) Data transformation limitation -> Raw data is rarely used. It is almost always transformed. There is no perfect transformation
d) Measurement limitation -> Metrics cannot capture the entire picture
e) Modeling limitation -> Not every relationship can be modeled. The models mostly confirm / deny hypotheses. Again, models need to be evaluated for their predictive strength before adoption
f) Interpretation limitation -> One needs to be careful when interpreting results; misinterpretations of data / metrics / model insights can be dangerous
g) Actionability limitation -> Not all insights are actionable. They may very well be interesting, but one cannot act on most insights
h) Using / relying on a single data source / data point -> It is easy to come to a conclusion based on a single or very few biased data points
6. At the end of the day, to make Big Data & Analytics work for you, question the outcomes and insights, reconcile them with your understanding, and use the insights for illumination rather than for support
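One way to read the point about evaluating a model's predictive strength before adoption is to compare it against a trivial baseline on data it has not seen. The sketch below does this with a deliberately simple threshold rule; the data sets, the threshold model and the function names are all made up for illustration and are not taken from the deck.

```python
def accuracy(predict, rows):
    """Share of rows where the prediction matches the recorded label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def fit_threshold(rows):
    """Pick the cut-off on x with the best accuracy on the training rows."""
    candidates = sorted({x for x, _ in rows})
    return max(candidates, key=lambda t: accuracy(lambda x: int(x >= t), rows))


def majority_baseline(rows):
    """Baseline that always predicts the most common training label."""
    majority = int(2 * sum(y for _, y in rows) >= len(rows))
    return lambda x: majority


# Hypothetical (feature, label) pairs, invented for this example.
TRAIN = [(1, 0), (2, 0), (3, 0), (4, 1), (5, 0), (6, 1), (7, 1), (8, 1)]
HOLDOUT = [(2, 0), (3, 1), (5, 1), (6, 1), (7, 0), (9, 1)]

if __name__ == "__main__":
    cutoff = fit_threshold(TRAIN)

    def model(x):
        return int(x >= cutoff)

    baseline = majority_baseline(TRAIN)

    print("model on training data:", accuracy(model, TRAIN))
    print("model on holdout data: ", accuracy(model, HOLDOUT))
    print("baseline on holdout:   ", accuracy(baseline, HOLDOUT))
```

With this made-up data the rule looks strong on the rows it was fitted to but does no better than the baseline on the holdout rows, which is the kind of gap that should be checked before acting on a model's output.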
32. Summary
1. Big Data is Big. It is easy to get lost. Know and understand what you are getting into before you leap
2. Decide where you want to play (i.e., get into the area where your strengths lie)
3. Build a roadmap of where you want to go and how you are going to get there
4. Fill in the skill gaps
5. Surround yourself with good people. You are the sum total of who and what you interact with
6. Have fun and enjoy what you are doing