2. What is Big Data?
•
“Big data” is a group of data technologies that
make the storage, manipulation, and
analysis of large volumes of data cheaper and
faster than ever.
•
Types of “Big data”
–
Transactional Data
–
Data from mobile apps
• Location data, profiles
3. Big Data Challenge
•
Managing the three “V”s of big data
–
Volume
–
Velocity
• The speed at which data is coming and changing
–
Variety
• Text, Audio, Video
•
Big Data is mainly unstructured data
•
Technology to store big data
4. The Business Needs
•
Traditionally, businesses wanted answers to five
questions
•
Traditional BI answers two of those questions
–
What happened? – Reports and ad-hoc queries
–
Why did it happen? – Analytics, cubes
•
Dashboards and scorecards answer the third
–
What is happening now?
5. Big Data Opportunity
•
Relational databases have limitations
–
Data needs to be modeled
–
Need to know the business needs to create good
data models
–
Data needs to be structured to support queries
•
Can we do analytics on big data and answer all
five business questions?
12. Big Data Opportunities
•
McKinsey projects that in the U.S. alone, there will
be a need by 2018 for 140,000 to 190,000 “data
scientists”
•
Steep technical learning curves and a lack of
qualified technical staff create barriers to adoption
13. Big Data Opportunities
•
Need for another 1.5 million data-literate
managers
–
Formal training in predictive analytics and
statistics.
•
The technologies in the big data area are not
analyst-friendly
–
Need programmers with knowledge of Hadoop,
statistics, and analytics
• Companies are retraining programmers and
database
17. DMA Campaign Response Rates
•
2010: Email to a house list averaged a 19.47% open rate, a 6.64% click-through rate,
and a 1.73% conversion rate, with a bounce-back
rate of 3.72% and an unsubscribe rate of 0.77%.
•
Direct mail: Letter-sized envelopes had a response rate this year of 3.42% for a
house list and 1.38% for a prospect list.
•
Catalogs had the lowest cost per order of $47.61, just ahead of inserts at
$47.69, email at $53.85, and postcards $75.32.
•
Outbound telemarketing to prospects had the highest cost per order of
$309.25, but it also had the highest response rate from prospects of 6.16%.
•
Paid search had an average cost per click of $3.79, with a 3.81% conversion
rate. The conversion rate (after click) of Internet display advertisements was
slightly higher at 4.43%.
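To compare paid search with the cost-per-order figures above, the cost per click can be converted into an approximate cost per order. The sketch below assumes the 3.81% paid-search conversion rate is measured after the click (the slide states this explicitly only for display ads); the function name is illustrative.

```python
def cost_per_order(cost_per_click, conversion_rate):
    """Approximate cost per order when each click converts at conversion_rate.

    Assumes conversion_rate is the fraction of clicks that become orders.
    """
    if conversion_rate <= 0:
        raise ValueError("conversion_rate must be positive")
    return cost_per_click / conversion_rate


# Figures quoted on the slide
paid_search = cost_per_order(3.79, 0.0381)   # roughly $99 per order
catalog = 47.61                              # lowest cost per order quoted
telemarketing = 309.25                       # highest cost per order quoted

print(f"Paid search: ${paid_search:.2f} per order")
print(f"Catalog: ${catalog:.2f}, telemarketing: ${telemarketing:.2f}")
```

Under that assumption, paid search sits between catalogs and outbound telemarketing in cost per order.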
20. Improving Offer Acceptance Rate: Algorithms to
Personalize Offers
•
K-means clustering to group users
–
Cluster users based on brand preferences and
demographics
–
One of the most popular clustering algorithms
•
Logistic regression for finding the probability
of accepting an offer
•
SVD (Singular Value Decomposition) to reduce
dimensionality of data and to reduce noise
–
Reducing the dimensions to a few improves
performance but can reduce accuracy
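As an illustration of the clustering step, here is a minimal K-means sketch (Lloyd's algorithm) in plain Python. The user features (age and share of premium-brand purchases) and all values are hypothetical, not from the deck. In practice features would be scaled first and a library implementation used.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster equal-length numeric tuples with Lloyd's algorithm."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # start from k distinct data points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments


# Hypothetical users: (age, share of purchases that are premium brands)
users = [(22, 0.10), (25, 0.15), (24, 0.05), (58, 0.80), (61, 0.90), (55, 0.85)]
centroids, labels = kmeans(users, k=2)
print(labels)
```

Each resulting cluster can then receive its own set of offers.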
22. How Does The Model Work?
–
Classification algorithms learn from examples in
a process known as training
–
Need training data and a choice of training
algorithm
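To make the training idea concrete, the sketch below fits a logistic regression classifier by batch gradient descent in plain Python. The training examples (discount offered, past purchases of the brand, and whether the offer was accepted) are invented for illustration, and gradient descent is only one possible training algorithm.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability that the example x belongs to the positive class."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Learn weights and bias from labeled examples via gradient descent."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi   # prediction error
            for j, xij in enumerate(xi):
                grad_w[j] += err * xij
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical training data: (discount fraction, past purchases of the brand)
X = [(0.05, 0), (0.10, 1), (0.10, 0), (0.30, 2), (0.40, 3), (0.50, 1)]
y = [0, 0, 0, 1, 1, 1]                     # 1 = offer accepted

weights, bias = train_logistic(X, y)
print(predict_proba(weights, bias, (0.50, 3)))   # likely acceptor
print(predict_proba(weights, bias, (0.05, 0)))   # unlikely acceptor
```

The trained model outputs a probability, which is what the offer-acceptance use case needs.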
23. Choosing Products for Customer and Ordering
(Diagram; recoverable labels: Customer Details, Sale Items, Click Prediction Model, Chosen Items, Product Display Order)
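One plausible reading of this slide is that a click prediction model scores each sale item for a customer, and the highest-scoring items are chosen and displayed in score order. The sketch below follows that reading. The features, weights, and item data are hypothetical, and the deck does not specify the actual model.

```python
import math


def click_probability(customer, item, weights, bias):
    """Logistic score from two illustrative features (assumed, not from the deck)."""
    features = [
        1.0 if item["brand"] in customer["preferred_brands"] else 0.0,
        item["discount"],
    ]
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def choose_and_order(customer, sale_items, weights, bias, top_n=2):
    """Return names of the top_n items, highest predicted click probability first."""
    ranked = sorted(
        sale_items,
        key=lambda item: click_probability(customer, item, weights, bias),
        reverse=True,
    )
    return [item["name"] for item in ranked[:top_n]]


customer = {"preferred_brands": {"Acme"}}
sale_items = [
    {"name": "jacket", "brand": "Acme", "discount": 0.10},
    {"name": "scarf", "brand": "Other", "discount": 0.30},
    {"name": "shoes", "brand": "Acme", "discount": 0.40},
]
display_order = choose_and_order(customer, sale_items, weights=[2.0, 1.0], bias=-1.0)
print(display_order)
```

In a real system the weights would come from a trained model rather than being set by hand.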
24. Conclusion
•
On the basis of our online surveys, face-to-face
survey, and analysis of studies done by
others, we conclude that the opportunity for a
marketing application based on big data and
machine learning is great. On a scale of 1-10,
we rate this opportunity at 9.