Case studies w/Analytics, Real Time, DM/ML in a hackathon
L’Oreal 8/27/2013
Not Hadoop
Agenda
• Problem Statement:
– Digital and Retail behavior analysis:
• Long tail problem similarities
– Propensity Marketing:
• Propensity for consumer to respond to promotion?
• Cover DM/ML Demographics presentation
– Profitability Marketing
• Who are the most profitable customers?
• Obvious answer: select * from customers join orders order by amt desc;
– Promotion Modeling
• What drives order values and who should receive promotions?
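The "obvious answer" query in the agenda has no join condition and sorts individual orders rather than customers. A minimal sketch of a per-customer version follows, using Python's built-in sqlite3 module. The schema (customers.id, customers.name, orders.customer_id, orders.amt) and the sample rows are assumptions for illustration only, and summing amt ranks customers by revenue; true profitability would also need cost data, which the slides do not describe.

```python
import sqlite3

# Hypothetical schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amt REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 20.0), (1, 35.0), (2, 80.0), (3, 5.0);
""")


def top_customers(conn, limit=10):
    """Return (name, total_amt) rows, highest total order amount first."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amt) AS total_amt
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY total_amt DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


ranked = top_customers(conn)
for name, total in ranked:
    print(name, total)
```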
What do I do
• Work, Tech lead Google, ~10y, Architect Absolute SW
• Teach, mentor others on Big Data, Hadoop, DM/ML
• http://www.meetup.com/HandsOnProgrammingEvents/
Review
• Theory:
– What is long tail?
– Long tail success case studies
– Demographic targeting/Modeling and prediction
– ML/DM success case studies
• Data Analysis Strategies/Structure
What is the Long Tail?
• Originated from search engines/Google
• Don’t focus on the top 20% of queries; focus on the bottom 50% first
• Why? The bottom 50% was the hardest: LP&SB. The top 20% was automatic
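One rough way to see how much volume sits in the head versus the tail is to rank distinct queries by frequency and compare shares. This is only a sketch: the function name, the 20% head cutoff as a parameter, and the sample queries are assumptions, not data from the talk.

```python
from collections import Counter


def head_tail_share(queries, head_fraction=0.2):
    """Rank distinct queries by frequency and return the share of total
    query volume in the head (top head_fraction of distinct queries)
    and in the tail (everything else)."""
    counts = Counter(queries)
    ranked = [n for _, n in counts.most_common()]
    total = sum(ranked)
    head_size = max(1, int(len(ranked) * head_fraction))
    head_volume = sum(ranked[:head_size])
    return head_volume / total, (total - head_volume) / total


# Made-up sample: one very popular query, one medium, three rare ones.
sample = (
    ["shoes"] * 6
    + ["lipstick"] * 3
    + ["red matte lipstick 24h", "vegan mascara", "spf 50 tint"]
)
head, tail = head_tail_share(sample)
print(f"head share={head:.2f} tail share={tail:.2f}")
```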
Long Tail Example, keywords
Keyword Lift/Complementary Strategies
• 70% of the keywords are not used frequently.
• PageRank/feature selection/spam reduction
– Most data (demographics) is inaccurate; eBay problem
• Quality of features enables ML/DM modeling
– Identify these words first using simple SQL queries, then run a model and use A/B testing to iterate to better results
– Example of ML/DM later
• Case study of data visualisation for search query length
Complete solution not possible
• A complete solution to the long tail is not possible via a hackathon
• Examples of Complete Solutions
– Example: Symantec uses a modified PageRank to see if virus files are safe/not safe. Viruses are different; all are unique. You can’t rely on past examples. >90% accuracy rate. Uses feedback from people.
– Example: Yahoo content system matching users to content, ~100 attributes->1k attributes. Most users only go to Yahoo news for a few stories. MM guides this
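For readers who have not seen PageRank, here is a minimal sketch of the plain textbook power iteration. It is not the modified variant attributed to Symantec above, which the slides do not describe; the graph, damping factor, and iteration count are illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {node: [outgoing nodes]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for t in nodes:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical graph: a, b and d all point at c; c points back at a.
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```

Nodes that many others point to end up with the highest score, which is the property a trust or reputation ranking builds on.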
Another long tail on search query length
Long Tail
• Obvious: longer queries imply the user wants a more precise result. Precision vs. recall
• Obvious: these users are more valuable b/c the directed intent is more focused. A user entering queries with more precision is very valuable for shopping and other applications with focused, directed intent
• The above case results in a $50.00 click to Google for Salesforce/SAP ads (e.g., home financing/mortgages)
• Best way to see this is in a demo:
– Move the mouse over dots which are close to each other: http://dataincolour.com:8888/#1144645000
– DEMO
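Before building a visualization, the query-length distribution can be checked with a few lines of code. The sketch below counts words per query; the sample queries and the "4 or more words counts as long" cutoff are assumptions chosen for illustration.

```python
from collections import Counter


def query_length_histogram(queries):
    """Count queries by number of whitespace-separated words."""
    return Counter(len(q.split()) for q in queries if q.strip())


# Made-up sample queries.
sample_queries = [
    "mortgage",
    "home financing",
    "30 year fixed mortgage rates",
    "how to replace a fuel filter",
    "lipstick",
]
hist = query_length_histogram(sample_queries)

# Share of queries with 4 or more words (an arbitrary cutoff for "long").
long_share = sum(n for words, n in hist.items() if words >= 4) / sum(hist.values())
print(dict(hist), long_share)
```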
Example real time applied to previous example
• We looked at search keywords and search phrase length. Visualizations as a substitute for Machine Learning algorithms. Much faster to implement
• Some students (under ~20 years old) did this in a weekend hackathon: http://www.dataincolour.com/2011/06/curiousnakes-visualization-of-aol-questions/
• http://datainsightsf.com/schedule-2/ Not repeated
What to do?
• Brainstorm some more; there is definitely something here. Play w/data; it will come in time. The most important part is the definition of the problem, not the code
– Think more, code less
• Should you copy the data visualisation example on Search Query Length?
– Probably not
• A long, long time ago Google displayed the incoming search queries in the lobby; this had practical use
• Real time constrains the problem: less complicated processing, less about the algorithm, more about the user
Why Real Time? Long Tail
• Do I really need real time? Yes, why?
• Pre-2010 Google search displayed all the results, a combination of precision and recall.
• Post-2010 Google went to instant search, limited recall. Nobody drilled down to the 1Mth page for DVDs. Better ad results with real time
• Analytics today is similar to pre-2010 Google search: batch processing using click logs
• Real time analytics is mostly custom solutions but can be much more effective. Once the user leaves the website it is too late to do anything. Many orders of magnitude difference. Precision >> Recall
UI: mouse over a stream of dots
Mouse on a dot which is part of a group which looks like a snake
• Can see what a user typed in as queries, one after another. Here is one example:
• How to fix car -> What is a fuel filter -> How to replace a fuel filter.
• This is valuable for adding additional features for the user who asked this
• Can't get this from SQL queries easily or at all.
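The query chains described here can be approximated offline by grouping each user's queries into sessions. The sketch below assumes events arrive as (user_id, timestamp in seconds, query) tuples and starts a new chain after a 30-minute gap; both the event shape and the gap are assumptions, not something the demo is known to use.

```python
from collections import defaultdict


def query_chains(events, max_gap_seconds=1800):
    """Group (user_id, timestamp, query) events into per-user chains.

    A new chain starts when the gap since the user's previous query
    exceeds max_gap_seconds.
    """
    by_user = defaultdict(list)
    for user_id, ts, query in sorted(events, key=lambda e: (e[0], e[1])):
        by_user[user_id].append((ts, query))

    chains = []
    for user_id, items in by_user.items():
        current = [items[0][1]]
        for (prev_ts, _), (ts, query) in zip(items, items[1:]):
            if ts - prev_ts > max_gap_seconds:
                chains.append((user_id, current))
                current = []
            current.append(query)
        chains.append((user_id, current))
    return chains


# Made-up events; u1 reproduces the fuel filter example above.
events = [
    ("u1", 0, "how to fix car"),
    ("u1", 60, "what is a fuel filter"),
    ("u1", 150, "how to replace a fuel filter"),
    ("u1", 9000, "car wash near me"),
    ("u2", 30, "red lipstick"),
]
chains = query_chains(events)
for user_id, chain in chains:
    print(user_id, " -> ".join(chain))
```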
What is the lesson here?
• Viewing data in real time has value
• At a minimum it helps clarify the thinking for the next step
• Use as an alerting system/QC process to show if ML/DM is running correctly (proprietary in Google/Yahoo). Every business has these.
• Key: visible to everybody w/o running a SQL query
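An alerting/QC check of this kind can start very small: count events in a sliding time window and flag when the volume falls outside an expected band. This is a sketch under assumptions; the class name, window size, and thresholds are hypothetical and say nothing about how Google or Yahoo implement theirs.

```python
from collections import deque


class RateMonitor:
    """Counts events in a sliding time window and flags low or high volume."""

    def __init__(self, window_seconds, low, high):
        self.window_seconds = window_seconds
        self.low = low
        self.high = high
        self.timestamps = deque()

    def record(self, ts):
        """Add one event at time ts (seconds) and drop expired events."""
        self.timestamps.append(ts)
        self._expire(ts)

    def _expire(self, now):
        # Remove events that are older than the window ending at `now`.
        while self.timestamps and self.timestamps[0] <= now - self.window_seconds:
            self.timestamps.popleft()

    def status(self, now):
        """Return 'low', 'ok', or 'high' for the window ending at `now`."""
        self._expire(now)
        count = len(self.timestamps)
        if count < self.low:
            return "low"
        if count > self.high:
            return "high"
        return "ok"


monitor = RateMonitor(window_seconds=60, low=2, high=5)
for t in (1, 5, 20):
    monitor.record(t)
status_busy = monitor.status(now=30)
status_quiet = monitor.status(now=200)
print(status_busy, status_quiet)
```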
Wisdom gained matches across 2 hackathons
• One of the most surprising pieces of work was a unique data visualization from the DM hackathon
• None of these positive results were defined in the problem statement. Required creativity.
• Careful
Review ML/DM
• Review a small subset of these slides:
– http://www.slideshare.net/DougChang1/demographics-andweblogtargeting-10757778
• Agenda: review a case study of the Motley Fool and how to create/target promotions to likely subscribers for problem #2, propensity marketing
• Case study of a past hackathon.
– My role: I seed the ideas, Mike Bowles, Nick Kolegraff
ML/DM Slides
• DO NOT INSERT SLIDES, cover the original so we don’t limit the scope of audience questions
ML/DM and Hackathons
• Done 2 as examples:
– Motley Fool, cosponsored by Kaggle (Mike Bowles)
– Best Buy, paid Kaggle (Nick Kolegraff@Accenture/DM SIG, we sought him out)
– These events require guidance. Both were very successful, and both are still receptive to more DM/ML events
• Careful: an algorithm doesn’t mean you have a production process or something someone can manage via a paid analyst headcount
• Why aren’t there more? Time investment to clean data, tech talk to guide participants, min 3 months work
What do I do for others which may help you?
• Seed the ideas; should add a structure to this. NDA. Run SQL queries
• Current Case Study
– Starting to do the prep work for another real time analytics example, teaching from this
– Nick/Mike did this for the other 2 hackathons.
• Match the strategy w/structure
– Take time off work to build an engineering prototype (Twitter Storm in old slide deck)
– Not covering this here
– Strategy: first display the data in a real time dashboard, then iterate the visualizations, then add DM/ML algorithms after the A/B testing framework is complete
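One small building block of an A/B testing framework is stable assignment of users to variants. A common approach, sketched here as an assumption rather than as the framework used in these hackathons, is to hash the user id together with an experiment name so the same user always lands in the same bucket. The function and experiment names are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically map a user to a variant by hashing the
    experiment name and user id together."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


# Hypothetical experiment name and user ids.
assignments = {uid: assign_variant(uid, "offer_layout") for uid in range(1000)}
counts = {v: list(assignments.values()).count(v) for v in ("control", "treatment")}
print(counts)
```

Including the experiment name in the hash keeps assignments independent across experiments, so a user in "treatment" for one test is not automatically in "treatment" for the next.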
One example, real time analytics, web page heat maps
Amazon Web Page
Google Shopping
Example/Reversed/Why?
Upper Left hand corner
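A web page heat map of the kind shown on these slides can be approximated by binning click coordinates into a grid and counting per cell. This is a sketch only: the (x, y) pixel format, the page size, the 4x4 grid, and the sample clicks are all assumptions.

```python
def click_heatmap(clicks, page_width, page_height, cols=4, rows=4):
    """Bin (x, y) click coordinates into a rows x cols grid of counts.

    grid[0][0] is the upper left-hand corner of the page.
    Clicks outside the page bounds are ignored.
    """
    grid = [[0] * cols for _ in range(rows)]
    for x, y in clicks:
        if 0 <= x < page_width and 0 <= y < page_height:
            col = int(x * cols / page_width)
            row = int(y * rows / page_height)
            grid[row][col] += 1
    return grid


# Made-up clicks, mostly near the upper left-hand corner.
clicks = [(10, 10), (50, 40), (120, 30), (900, 700), (300, 500)]
grid = click_heatmap(clicks, page_width=1000, page_height=800)
for row in grid:
    print(row)
```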
Example of Kiehls
Kiehl’s Example
• Put in offers w/($ amount, product desc, click url) customized per user, A/B test layouts and placement, store data for customization and measure lift
• Measure Facebook ads via PageRank
• Predict missing links application
• http://blog.echen.me/2012/07/31/edge-prediction-in-a-social-graph-my-solution-to-facebooks-user-recommendation-contest-on-kaggle/
• Careful, don’t copy. Example only. Generalize to hackathon. Many other ideas
• Your answer is different from Yahoo & Google. This isn’t a roadmap.
Promotion Modeling
• Is this a long tail problem?
– How to formulate the graph and influence across nodes?
– Which features to select to use for modeling?
– Still ok if you don’t have the long tail answer. Follow the Demographics Customer modeling example.
• How to change the model over time?
• Metrics for promotion effectiveness
– Facebook campaigns are easy to iterate and run. Still need some form of A/B testing
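One possible promotion-effectiveness metric is relative lift in conversion rate between a control group and a promoted group, paired with a two-proportion z statistic as a rough significance check. The sketch below uses made-up counts; the function name and the choice of a pooled z-test are assumptions, not the method from the talk.

```python
import math


def conversion_lift(control_conv, control_n, treat_conv, treat_n):
    """Return (relative lift, z statistic) for treatment vs. control
    conversion rates, using a pooled two-proportion z-test."""
    p_c = control_conv / control_n
    p_t = treat_conv / treat_n
    lift = (p_t - p_c) / p_c
    pooled = (control_conv + treat_conv) / (control_n + treat_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / treat_n))
    return lift, (p_t - p_c) / std_err


# Made-up counts: 5.0% conversion without the promotion, 6.5% with it.
lift, z = conversion_lift(control_conv=100, control_n=2000,
                          treat_conv=130, treat_n=2000)
print(f"lift={lift:.1%} z={z:.2f}")
```

As a rule of thumb, |z| above roughly 1.96 corresponds to significance at the 5% level for a two-sided test.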
Structure has to match Strategy
• Partner w/Macy’s? Develop a structure to work with retail partners to increase their sales
– E.g. customized shopkick
– Don’t just release APIs, release mobile app source code people can modify
• Test promotions and building profiles?
• … lots of ideas

Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Último (20)

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

L'Oreal Tech Talk

  • 1. Case studies w/Analytics, Real Time, DM/ML in a hackathon L’Oreal 8/27/2013 Not Hadoop
  • 2. Agenda • Problem Statement: – Digital and Retail behavior analysis: • Long tail problem similarities – Propensity Marketing: • Propensity for consumer to respond to promotion? • Cover DM/ML Demographics presentation – Profitability Marketing • Who are the most profitable customers? • Obvious answer, select * from customers join orders order by amt desc; – Promotion Modeling • What drives order values and who should receive promotions?
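The "obvious answer" query on the agenda slide can be made runnable as a minimal sketch. The slide names only `customers`, `orders`, and `amt`; the column names `id`, `name`, and `customer_id`, the sample rows, and the helper `top_customers` are assumptions for illustration. Total order amount is used here as a stand-in for profitability, which the slide does not define.

```python
import sqlite3

# Hypothetical schema and sample data: only `customers`, `orders` and `amt`
# come from the slide; every other name here is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amt REAL);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 35.0), (3, 2, 90.0), (4, 3, 5.0);
""")

def top_customers(conn, limit=10):
    """Return (name, total_amt) rows, highest total order amount first."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amt) AS total_amt
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        ORDER BY total_amt DESC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()

print(top_customers(conn))
```

Aggregating per customer, rather than sorting individual order rows, is a design choice of this sketch: it ranks customers instead of single large orders.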
  • 3. What do I do • Work, Tech lead Google, ~10y, Architect Absolute SW • Teach, mentor others on Big Data, Hadoop, DM/ML • http://www.meetup.com/HandsOnProgrammingEvents/
  • 4. Review • Theory: – What is long tail? – Long tail success case studies – Demographic targeting/Modeling and prediction – ML/DM success case studies • Data Analysis Strategies/Structure
  • 5. What is the Long Tail? • Originated from search engines/Google • Don’t focus on the top 20% queries, focus on the bottom 50% first • Why? The bottom 50% was the hardest: LP&SB. The top 20% was automatic
  • 7. Keyword Lift/Complementary Strategies • 70% of the keywords are not used frequently. • Page Rank/feature selection/Spam reduction – Most data (demographics) is inaccurate, eBay problem • Quality of features enables ML/DM modeling – Identify these words first using simple SQL queries, then run a model and use A/B testing to iterate to better results – Example of ML/DM later • Case study of data visualisation for search query length
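The first step on this slide, identifying infrequently used keywords before any modeling, can be sketched in a few lines. This is an illustration only: the function name `long_tail_keywords`, the `max_count` threshold, and the sample queries are assumptions, and whitespace splitting stands in for whatever tokenization a real system would use.

```python
from collections import Counter

def long_tail_keywords(queries, max_count=1):
    """Return (tail_keywords, tail_share).

    tail_keywords: keywords seen at most `max_count` times, sorted.
    tail_share: fraction of distinct keywords that fall in the tail.
    """
    counts = Counter(word for q in queries for word in q.lower().split())
    tail = sorted(w for w, n in counts.items() if n <= max_count)
    share = len(tail) / len(counts) if counts else 0.0
    return tail, share

# Hypothetical sample queries, not taken from the deck.
queries = ["red lipstick", "red nail polish", "matte red lipstick", "vegan mascara"]
tail, share = long_tail_keywords(queries)
print(tail, round(share, 2))
```

The resulting tail list is the kind of candidate feature set the slide suggests feeding into a model and then iterating on with A/B tests.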
  • 8. Complete solution not possible • A complete solution to the long tail is not possible via a hackathon • Examples of Complete Solutions – Example: Symantec uses modified page rank to see if virus files are safe/not safe. Viruses are different, all are unique. You can’t rely on past examples. >90% accuracy rate. Uses people feedback. – Example: Yahoo content system matching users to content ~100 attributes->1k attributes. Most users only go to Yahoo news for a few stories. MM guides this
  • 9. Another long tail on search query length
  • 10. Long Tail • Obviously, longer queries imply the user wants a more precise result. Precision vs. Recall • Obviously, these users are more valuable because their intent is more focused. Seeing users enter more precise queries is very valuable for shopping and other applications with focused, directed intent • The above case results in a $50.00 click to Google for Salesforce/SAP ads (e.g. home financing/mortgages) • Best way to see this is in a demo: • Move the mouse over dots which are close to each other: http://dataincolour.com:8888/#1144645000 • DEMO!
  • 11. Example real time applied to previous example • We looked at search keywords and search phrase length. Visualizations as a substitute for Machine Learning algorithms. Much faster to implement • Some students <~20 years old did this in a weekend hackathon: http://www.dataincolour.com/2011/06/curiousnakes-visualization-of-aol-questions/ • http://datainsightsf.com/schedule-2/ Not repeated
  • 12. What to do? • Brainstorm some more, definitely something here, play w/data; will come in time. The most important part is the definition of the problem, not the code – Think more, code less • Should you copy the data visualisation example on Search Query Length? – Probably not • A long long time ago Google displayed the incoming search queries in the lobby; this had practical use • Real time constrains the problem: less complicated processing, less about the algorithm, more about the user
  • 13. Why Real Time? Long Tail • Do I really need real time? Yes, why? • Pre-2010 Google search displayed all the results, a combination of precision and recall. • Post-2010 Google went to instant search, limited recall. Nobody drilled down to the 1Mth page for DVDs. Better ads results with real time • Analytics today is similar to pre-2010 Google search: batch processing using click logs • Real time analytics are mostly custom solutions but can be much more effective. Once the user leaves the website it is too late to do anything. Many orders of magnitude difference. Precision >> Recall
  • 14. UI:mouse over a stream of dots
  • 15. Mouse on a dot which is part of a group which looks like a snake • Can see what the user typed in as queries one after another; here is one example: • How to fix car -> What is a fuel filter -> How to replace a fuel filter. • This is valuable in adding additional features for the user who asked this • Can't get this from SQL queries easily or at all.
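The "snake" on this slide is a sequence of queries typed by one user in succession. A minimal sketch of recovering such chains from a query log follows; the row layout `(user_id, timestamp, query)`, the 30-minute gap threshold, and the name `query_chains` are assumptions, and this is not the code behind the demo.

```python
from collections import defaultdict

def query_chains(log, max_gap_seconds=1800):
    """Group (user_id, timestamp, query) rows into per-user query chains.

    A new chain starts whenever the gap between two consecutive queries
    from the same user exceeds `max_gap_seconds`.
    """
    by_user = defaultdict(list)
    for user_id, ts, query in log:
        by_user[user_id].append((ts, query))

    chains = []
    for user_id, rows in by_user.items():
        rows.sort()  # order each user's queries by timestamp
        current, last_ts = [], None
        for ts, query in rows:
            if last_ts is not None and ts - last_ts > max_gap_seconds:
                chains.append((user_id, current))
                current = []
            current.append(query)
            last_ts = ts
        chains.append((user_id, current))
    return chains

# Hypothetical log; the u1 queries follow the example on the slide.
log = [
    ("u1", 0, "how to fix car"),
    ("u2", 10, "dvd player"),
    ("u1", 60, "what is a fuel filter"),
    ("u1", 200, "how to replace a fuel filter"),
    ("u1", 90000, "cheap flights"),
]
for user_id, chain in query_chains(log):
    print(user_id, " -> ".join(chain))
```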
  • 16. What is the lesson here? • Viewing data in real time has value • At minimum it helps clear the thinking for the next step • Use as an alerting system/QC process to show if ML/DM is running correctly (proprietary in Google/Yahoo). Every business has these. • Key: visible to everybody w/o running a SQL query
  • 17. Wisdom gained matches across 2 hackathons • One of the most surprising pieces of work was a unique data visualization from the DM hackathon • None of these positive results were defined in the problem statement. Required creativity. • Careful
  • 18. Review ML/DM • Review a small subset of these slides: – http://www.slideshare.net/DougChang1/demographics-andweblogtargeting-10757778 • Agenda: review a case study of the Motley Fool and how to create/target promotions to likely subscribers for problem #2, propensity marketing • Case study of a past hackathon. – My role: I seed the ideas, Mike Bowles, Nick Kolegraff
  • 19. ML/DM Slides • DO NOT INSERT SLIDES, cover the original so we don’t limit the scope of audience questions
  • 20. ML/DM and Hackathons • Done 2 as examples, – Motley Fool, cosponsored by Kaggle (Mike Bowles) – Best Buy, paid Kaggle (Nick Kolegraff@Accenture/DM SIG, we sought him out) – These events require guidance/very successful, both still are receptive to more DM/ML events • Careful: an algorithm doesn’t mean you have a production process or something someone can manage via a paid analyst headcount • Why aren’t there more? Time investment to clean data, tech talk to guide participants, min 3 months work
  • 21. What do I do for others which may help you? • Seed the ideas; should add a structure to this. NDA. Run SQL queries • Current Case Study – Starting to do the prep work for another real time analytics example, teaching from this – Nick/Mike did this for the other 2 hackathons. • Match the strategy w/structure – Take time off work to build an engineering prototype (Twitter Storm in old slide deck) – Not covering this here – Strategy: first display the data in a real time dashboard then iterate the visualizations, then add DM/ML algorithms after the A/B testing framework is complete
  • 22. One example, real time analytics, web page heat maps
  • 25. Upper Left hand corner
  • 27. Kiehl’s Example • Put in offers w/($ amount, product desc, click url) customized per user, A/B test layouts and placement, store data for customization and measure lift • Measure Facebook ads via page rank • Predict missing links application • http://blog.echen.me/2012/07/31/edge-prediction-in-a-social-graph-my-solution-to-facebooks-user-recommendation-contest-on-kaggle/ • Careful, don’t copy. Example only. Generalize to hackathon. Many other ideas • Your answer is different from Yahoo & Google. This isn’t a roadmap.
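"Measure lift" from an A/B test of offer layouts can be sketched as the relative change in click-through rate between a control and a variant, together with a two-proportion z-score as a rough significance check. The function names and the sample counts below are assumptions for illustration; the deck does not specify how lift was computed.

```python
import math

def conversion_lift(control_clicks, control_views, variant_clicks, variant_views):
    """Relative lift of the variant click rate over the control click rate."""
    if control_views <= 0 or variant_views <= 0 or control_clicks <= 0:
        raise ValueError("views and control clicks must be positive")
    control_rate = control_clicks / control_views
    variant_rate = variant_clicks / variant_views
    return (variant_rate - control_rate) / control_rate

def two_proportion_z(control_clicks, control_views, variant_clicks, variant_views):
    """z-score for the difference of two click rates, using a pooled rate."""
    pooled = (control_clicks + variant_clicks) / (control_views + variant_views)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_views + 1 / variant_views))
    return (variant_clicks / variant_views - control_clicks / control_views) / std_err

# Hypothetical counts: 50 clicks in 1000 views (control), 60 in 1000 (variant).
lift = conversion_lift(50, 1000, 60, 1000)
z = two_proportion_z(50, 1000, 60, 1000)
print(f"lift={lift:.2f} z={z:.2f}")
```

With these counts the variant shows roughly a 20% relative lift, yet the z-score is below the usual 1.96 threshold, so the difference would not be treated as significant at this sample size.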
  • 28. Promotion Modeling • Is this a long tail problem? – How to formulate the graph and influence across nodes? – Which features to select to use for modeling? – Still ok if you don’t have the long tail answer. Follow the Demographics Customer modeling ex. • How to change the model over time? • Metrics for promotion effectiveness – Facebook campaigns are easy to iterate and run. Still need some form of A/B testing
  • 29. Structure has to match Strategy • Partner w/Macy’s? Develop a structure to work with retail partners to increase their sales – E.g. customized shopkick – Don’t just release APIs, release mobile app source code ppl can modify • Test promotions and building profiles? • … lots of ideas