Multiple Objective Optimization in Recommender Systems
Mario Rodriguez, Christian Posse, Ethan Zhang (LinkedIn)
Motivation
1. Value of a recommender system given by its multi-faceted utility function:
   utility = fn(relevance, engagement, …)
2. We want to efficiently improve the utility of the system by focusing on the most promising facet(s)
Outline
• TalentMatch case study
  o Overview
  o Utility function – Multiple Objectives!
  o Approach details
• Problem formulation & Optimization
  o A/B test results
TalentMatch
[Diagram: a job posting and member profiles go in; a ranked list of talent comes out]
TalentMatch Model
• Job Posting: title, geo, company, industry, description, functional area, …
• Candidate – General: expertise, specialties, education, headline, geo, experience
• Candidate – Current Position: title, summary, tenure length, industry, functional area, …
• Matching features: transition probability, cosine similarity, …
TalentMatch Teaser Snippet
TalentMatch Utility = fn(booking rate, email rate, reply rate)
Booking Rate → Email Rate → Reply Rate
Problem: the reply rate. Job seeker?
Flightmeter: Job Seeker Intent Model
• Propensity Score
  o p(switch jobs in next month)
• Model
  o Survival Analysis of Positions
  o Accelerated failure time (AFT) model: log T_i = Σ_k β_k x_ik + σ ε_i
• Segments: ACTIVE, PASSIVE, NON-JOB-SEEKER
Flightmeter Feature Example: Industry
[Figure: per-industry survival curves; axes: attrition probability vs. time]
Flightmeter: actives & passives
16x reply rate on career-related mail → Reply Rate
What: Increase TalentMatch Utility
fn(booking rate, email rate, reply rate)
Flightmeter as Another Feature?
• Job Posting: title, geo, company, industry, description, functional area, …
• Candidate – General: expertise, specialties, education, headline, geo, experience
• Candidate – Current Position: title, summary, tenure length, industry, functional area, …
• Matching features: transition probability, cosine similarity, Flightmeter?
Histogram of the Match Score at the 12th Rank
[Figure: distribution of the 12th-ranked match score, from threshold t to 1.0]
How: Controlled Ranking Perturbation

TalentMatch ranking (Match Score):
  1. Item X, 0.98, Non-Seeker
  2. Item Y, 0.91, Non-Seeker
  ---------------------------------------
  3. Item Z, 0.89, Active

→ Perturbation Function f() →

Perturbed ranking (Match Score, Perturbed Score):
  1. Item X, 0.98, 0.98, Non-Seeker
  2. Item Z, 0.89, 0.93, Active
  ------------------------------------------------
  3. Item Y, 0.91, 0.91, Non-Seeker

Divergence Function Δ() on the two match score distributions → divergence score
Objective Function g() on the perturbed ranking → objective score
Problem Formulation
• Perturbation Function
• Divergence Function
• Objective Function

Problem Formulation: TalentMatch Instantiation
• Perturbation function f(): boost the TalentMatch score of active (α) and passive (β) job seekers
• Divergence function Δ(): Euclidean distance between the original and perturbed match-score distributions
• Objective function g(): average number of actives and passives in the top 12
Finding a Good Perturbation Function
• Loss Function
• Objective and divergence depend on a sort/rank, so gradient-based optimization not directly applicable
• Lambda value?
Pareto Optimization
[Figure: objective (avg. actives + passives in top 12) vs. divergence; each point is an (α, β) setting, with the Pareto frontier highlighted]

Match Score Histogram Divergence
[Figure: match score histograms at divergence 0, 27, 54, and 100]
Computational Approaches
• Grid Search
• Gradient-based techniques
Gradient-based Techniques
[Figure: scalarized loss lines for different λ (e.g. λ = 0.076 and 0 < λ < 0.076) tangent to the Pareto frontier]
Gradient-based Techniques, cont.
• Smooth approximations to popular ranking metrics, amenable to gradient descent
  o Normalized Discounted Cumulative Gain (NDCG)
  o Average Precision (AP)
• Re-frame the Multi-Objective Optimization problem using those approximations, and apply SmoothRank
Experiments
• A/B Test
  o Treatment 1: 1.15 boost (≈8 actives/passives in top 12)
  o Treatment 2: 1.07 boost (≈6 actives/passives in top 12)
  o Control: 1.0 boost (≈4 actives/passives in top 12)
• Expectations
  o 50% increase in reply rate for the 1.07 boost
  o 100% increase in reply rate for the 1.15 boost
  o Booking rate and email rate expected to remain unchanged or be minimally affected
A/B Test Results (% increase over control)

                 Booking rate   Email rate   Reply rate
α = β = 1.07           0%           31%          22%
α = β = 1.15        -0.4%           25%          42%
Conclusion
• Consider the multiple facets of your system’s
utility function to improve utility efficiently
o Handle competing objectives carefully
• Know your tradeoff(s)!
o A/B test furiously
[Chart: LinkedIn Members (Millions), 2004–2011: 2, 4, 8, 17, 32, 55, 90, …]
• 175M+ members
• 25th most-visited website worldwide (comScore 6/12)
• >2M company pages
• 62% non-U.S.
• 2 new members/sec
• 85% of Fortune 500 companies use LinkedIn to hire
Thank You! We're Hiring!
mrodriguez@linkedin.com
Editor's Notes
  1. Hi, my name is Mario Rodriguez and I'm here to talk about work that my colleagues and I are doing at LinkedIn to improve our recommender systems, specifically in the area of multiple objective optimization.
  2. Here is the motivation for our work: the performance of a recommender system is given by its utility function, and this utility function is often multi-faceted. This talk is about improving multi-faceted utility functions in a way that focuses on the most promising facets. And by efficiency, we mean something that relates to ROI, or the best bang for your buck. Generally, since the utility function is hard to tackle directly, what is done is that it is broken down into subcategories, possibly recursively, until we arrive at something concrete that we can work with.
  3. Here is a quick outline of the talk. We will do a deep dive on one specific case study, a system called TalentMatch, which is a revenue-generating recommender system at LinkedIn that recommends talent to job posters, and which will help illustrate the details of our approach. Though we are discussing a specific use case, our approach is broadly applicable, and is being used in a variety of products at LinkedIn. So, we are going to give a brief overview of Talent Match. We will discuss its utility function, which is multi-faceted. And then we will show how we improved its utility, with evidence from real A/B test results.
  4. So, here is a high-level overview of TalentMatch. Someone comes to the site and posts a job. We then scour the entire member database looking for the members who best match that job, and we recommend a ranked list of those members to the job poster.
  5. Just to add some context, here’s what a job posting looks like. A job posting is very rich in content, and though it is not completely structured, parsing it is not too hard. Some obvious fields include the title, the description, the skills, and the region.
  6. And here's what a member profile looks like. We also have very detailed data about our members, including the title and description of their current job, how long they've been there, their skills, their current region… and we can match those attributes to the respective attributes of the job posting.
  7. This is how we do this matching. We combine the job and the candidate into a single feature vector, where each feature denotes a similarity measure between attributes of the job and attributes of the candidate, and then we find the relative importance of these features using a supervised learning method like logistic regression trained on a click signal such as job applications. This gives us a model that knows how to differentiate good job-member pairs from bad job-member pairs.
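A minimal sketch of what this matching step could look like, assuming scikit-learn and a logistic-regression model trained on a click-like signal; the feature layout and synthetic data below are our own illustration, not LinkedIn's actual features or code.

```python
# Minimal sketch of the matching step described above; not LinkedIn's actual code.
# Assumption: each (job, candidate) pair has already been turned into a vector of
# similarity features (title cosine similarity, industry transition probability,
# geo match, ...), and the label is a click-like signal such as "applied".
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for the real pairwise similarity features and labels.
n_pairs, n_features = 1000, 4          # e.g. [title_sim, desc_sim, transition_prob, same_geo]
X = rng.random((n_pairs, n_features))
y = (X @ np.array([2.0, 1.0, 1.5, 0.5]) + rng.normal(0, 1, n_pairs) > 2.5).astype(int)

# Supervised model that separates good from bad job-member pairs.
match_model = LogisticRegression().fit(X, y)

# The match score used to rank candidates for a job is the predicted probability.
match_scores = match_model.predict_proba(X)[:, 1]
print("learned feature weights:", match_model.coef_.round(2))
```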
  8. So, once someone posts a job and we run our TalentMatch model on our member database, this is the snippet of results we produce for the job poster. At this point there is no identifiable information, so if the job poster likes what they see, they have the option to purchase the result set to see the candidates' full profiles and contact them.
  9. Let's go over the facets of the utility function of the TalentMatch system. First, the snippet needs to be good enough to convince the job poster to purchase the recommendations. That's the booking rate. Then, once purchased, the job poster gets to look at the full profile of each recommended candidate and decides whether or not they are indeed a good match for the job. If a candidate is a good match, the job poster may then decide to email the candidate regarding the job opportunity. That's the email rate. Finally, if the candidate is interested, then the candidate will reply positively to the job poster, giving us the reply rate. Now that the link is established, they can take it from there. But from our perspective, these 3 steps are required for there to be relevant engagement within this system. Out of the 3 facets of the utility function, the reply rate was identified as needing improvement. Job posters were complaining that they were emailing candidates, but the candidates were not replying enough. This was the problem we needed to solve. We figured the booking rate and the email rate were well accounted for by the existing TalentMatch model, but even if someone is a great match for the job, that does not mean they are going to reply. So, we thought that maybe people were not replying because they were probably not looking for a job. What if we could determine if someone was a job seeker, and then include more of those people in the recommendations?
  10. So, we had already developed a model that computes the job seeking propensity for each member. It turns out that many people who are open to new opportunities do not self-identify as job seekers, so this model helps us identify those people. You can think of the job seeking propensity as the probability that the member will switch positions in the next month. We also output a segmentation of this probability into actives, passives, and non-job-seekers, and we consider actives and passives to have a high job seeking intent. This job seeking intent model is completely different from the TalentMatch model. It is a survival model where the entity whose survival we're analyzing is a job, or more specifically, a position. Based on data derived from the lifetimes of millions of positions, we model the duration of a position as a function of various features in what is known as an accelerated failure time model, and this allows us to compute the probability that a given position will end within the next time period.
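To make the AFT formulation concrete, here is a hedged sketch that fits log T_i = Σ_k β_k x_ik + σ ε_i with Gaussian ε (a log-normal AFT) on synthetic, right-censored position durations and turns it into a "switch in the next month" style propensity. The real Flightmeter features, error distribution, and estimation code are not described in the slides, so everything below is illustrative.

```python
# Sketch of the AFT model log T_i = sum_k beta_k * x_ik + sigma * eps_i, here with
# Gaussian eps (a log-normal AFT) and right-censored positions. Synthetic data only;
# the real Flightmeter features and error distribution may differ.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n, k = 5000, 3
X = rng.normal(size=(n, k))                                # e.g. industry, seniority, site activity
true_beta, true_sigma = np.array([0.5, -0.3, 0.2]), 0.8
log_T = X @ true_beta + true_sigma * rng.normal(size=n)    # true (log) position duration
censor = rng.exponential(np.exp(0.3), size=n)              # observation window
T_obs = np.minimum(np.exp(log_T), censor)                  # what we actually observe
event = (np.exp(log_T) <= censor).astype(float)            # 1 = position ended, 0 = censored

def neg_log_lik(params):
    beta, sigma = params[:k], np.exp(params[k])
    z = (np.log(T_obs) - X @ beta) / sigma
    # Ended positions contribute the density, censored ones the survival probability.
    ll = event * (norm.logpdf(z) - np.log(sigma)) + (1 - event) * norm.logsf(z)
    return -ll.sum()

fit = minimize(neg_log_lik, x0=np.zeros(k + 1), method="BFGS")
beta_hat, sigma_hat = fit.x[:k], np.exp(fit.x[k])

def p_switch_next_month(x, t0):
    """Propensity-style output: P(position ends within one more month | lasted t0 months)."""
    z = lambda t: (np.log(t) - x @ beta_hat) / sigma_hat
    return (norm.cdf(z(t0 + 1)) - norm.cdf(z(t0))) / norm.sf(z(t0))

print(round(p_switch_next_month(X[0], t0=12.0), 3))
```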
  11. There are many signals that we can use to compute the job seeking intent. We may have the user's job seeking activity on the site: are they searching or applying for jobs? Those are obvious signals. But we have others. For example, we know that different industries have different attrition rates. This plot includes a few representative industries and their survival curves. The survival curve gives the probability that someone will still be at their position X months down the road if they start that position today. These are survival curves for a few of the most extreme industries, some of the most hazardous including “political organization” and “animation” and some of the least hazardous including “alternative medicine” and “ranching”. In the “political organization” industry, which is the red line at the bottom, more than 50% of people don't last 2 years in a given position.
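The per-industry curves described here are survival estimates of the Kaplan-Meier kind; below is a small self-contained sketch of how such curves can be computed, with synthetic tenures and placeholder industry parameters (the actual curves and numbers in the talk are not reproduced here).

```python
# Kaplan-Meier sketch of per-industry survival curves: probability of still being
# in a position X months after starting it. Synthetic tenure data; industry mean
# tenures below are placeholders, not the figures from the talk.
import numpy as np

def kaplan_meier(durations, ended):
    """Return (event_times, S(t)) from observed tenures and an ended/censored flag."""
    order = np.argsort(durations)
    durations, ended = np.asarray(durations)[order], np.asarray(ended)[order]
    at_risk, surv, times, curve = len(durations), 1.0, [], []
    for t, d in zip(durations, ended):
        if d:                                  # position actually ended at time t
            surv *= 1.0 - 1.0 / at_risk
            times.append(t)
            curve.append(surv)
        at_risk -= 1                           # one fewer person still under observation
    return np.array(times), np.array(curve)

rng = np.random.default_rng(1)
industries = {"political organization": 20.0, "ranching": 70.0}   # placeholder mean tenures (months)
for name, mean_tenure in industries.items():
    tenure = rng.exponential(mean_tenure, 500)
    ended = rng.random(500) < 0.8              # roughly 20% of positions right-censored
    t, s = kaplan_meier(tenure, ended)
    print(name, "P(still in position at 24 months) ≈", round(float(np.interp(24, t, s)), 2))
```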
  12. So, intuitively, it makes sense to suggest users who are job seekers in TalentMatch. But we confirmed our intuition: we ran the numbers and saw that users with a high job seeking intent (actives and passives) have a much higher rate of reply to career-related emails when compared to non-job-seekers (16 times the reply rate). And this is exactly the facet of the utility function of TalentMatch that we are interested in improving. So, what we want to do is incorporate the job seeker intent into the TalentMatch model, and we want to do so without negatively affecting the booking rate and the email rate.
  13. So, could we just add the job seeker propensity score as a feature into the talent match model and retrain? Actually, doing that is not a good idea. Talent Match learns the concept of whether or not the member is a good match for the job, and a member-job pair does not become a better or worse match as a function of the job seeking propensity of the member. So, we need something else…
  14. So, how do we do it? Well, we can look at the TalentMatch score distribution of the top-K recommendations, and treat that as a kind of ground truth from which we cannot deviate too much. Basically, having optimized for matching in the TalentMatch model, we want to perturb the ranking slightly to gain as much as possible in other metrics, but without sacrificing the quality of the match. This is the TalentMatch score distribution of the 12th recommendation. We care about the 12th recommendation because that's how many results we show on a single page. The x axis goes from some threshold T (below which items do not get recommended) to 1.0, and we see that there is a peak around 1.0, suggesting that even at the 12th position, the bulk of the recommendations are of very high quality. This high quality is the essence of the system, and whatever enhancements we make to it, we don't want to lose this quality.
  15. So, what we want is a controlled perturbation of the ranking output by the TalentMatch model, and this is how we are going to do it: given the TalentMatch ranking, we run a perturbation function on it that generates another ranking, the perturbed ranking, which optimizes for a metric we're interested in (in the case of TalentMatch, it's the number of users with high job-seeking intent in the top-12 recommendations). Given the 2 rankings and their distributions of match scores, we can compute the distance between them using a variety of metrics, for example KL divergence or Euclidean distance. This divergence score is what will help us make sure we are not negatively affecting the quality of the recommendations. Notice how, in the perturbed ranking, item Z was bumped from its original third position, below the cutoff line, to the second position, and so whereas before we had 2 non-seekers above the cutoff, meaning they would be recommended, now we have a non-seeker and an active. Also notice that the perturbation is minimal. We should feel comfortable bumping item Z to the second position, but not to the first position. There are then 3 functions that we need to define: the perturbation function, the divergence function, and the objective function. The parameters of the perturbation function are what we will estimate, based on the performance established by the divergence and objective measures: we want high scores on the objective and low scores on the divergence.
  16. Here is the instantiation of those functions for the TalentMatch case. The perturbation function simply applies a small boost to the match score, denoted by the letter “y”, and we allow that boost to be different for active and passive job seekers (as denoted by the alpha and the beta parameters). The divergence function is simply the Euclidean distance between the distribution of scores in the TalentMatch ranking and the distribution of scores in the perturbed ranking. This is simply a measure of how much match quality was affected (a divergence score of 0 means that the quality of the matches remained unaffected). The objective is the average number of actives and passives in the top-12.
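A toy sketch of these three functions, under the assumption (not stated explicitly in the slides) that the boost is multiplicative on the match score y; the candidate pool, segments, and scores below are synthetic.

```python
# Sketch of the TalentMatch instantiation described above: perturbation f(),
# divergence Δ(), and objective g(). Assumption: a multiplicative boost on the
# match score y, alpha for actives and beta for passives. Synthetic data only.
import numpy as np

TOP_K = 12

def perturb(scores, segment, alpha, beta):
    """Perturbation function f(): boost match scores of likely job seekers."""
    boost = np.where(segment == "active", alpha,
             np.where(segment == "passive", beta, 1.0))
    return scores * boost

def divergence(scores, perturbed_scores, bins=20):
    """Divergence function Δ(): Euclidean distance between the match-score
    histograms of the original and perturbed top-K results."""
    top_orig = np.sort(scores)[::-1][:TOP_K]
    top_pert = scores[np.argsort(perturbed_scores)[::-1][:TOP_K]]   # original scores of the new top-K
    h1, edges = np.histogram(top_orig, bins=bins, range=(0.0, 1.0))
    h2, _ = np.histogram(top_pert, bins=edges)
    return float(np.linalg.norm(h1 - h2))

def objective(perturbed_scores, segment):
    """Objective function g(): number of actives + passives in the top-K."""
    top = np.argsort(perturbed_scores)[::-1][:TOP_K]
    return int(np.isin(segment[top], ["active", "passive"]).sum())

# Toy example: 100 candidates for one job.
rng = np.random.default_rng(0)
scores = rng.beta(8, 2, 100)                                        # skewed toward high match scores
segment = rng.choice(["active", "passive", "non-seeker"], 100, p=[0.1, 0.2, 0.7])
p = perturb(scores, segment, alpha=1.15, beta=1.15)
print("objective:", objective(p, segment), " divergence:", divergence(scores, p))
```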
  17. To find a good perturbation function, we can construct a typical loss function, where the effect of the divergence is governed by a regularization parameter lambda, and then optimize this loss function to find the parameters of the perturbation function, alpha and beta, which correspond respectively to the boosts for active and passive job seekers. However, there is a complicating factor: both the divergence and objective functions depend on a ranking, which depends on a sorting operation, and therefore traditional gradient-based approaches are not readily applicable. Also, what should we set lambda to? We don't just want to use the lambda that generates the lowest loss; we are actually more interested in our options for trading off the objective against the divergence.
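Concretely, the loss described here can be written (with sign conventions assumed on our part, since the slide's equation did not survive extraction) as L(α, β; λ) = −g(α, β) + λ · Δ(α, β), minimized over (α, β) for a fixed λ: the objective g is rewarded, the divergence Δ is penalized, and λ sets the exchange rate between the two.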
  18. We will discuss computational strategies for optimizing the perturbation function in a moment, but before that, we need to discuss the kind of optimization we are actually interested in. What we really want is Pareto optimization, where there is not one optimal solution; instead, there are some solutions which are better in one objective, while other solutions are better in others. In this plot, we have the objective, the average number of actives and passives in the top-12 results, on the y-axis, and the divergence on the x-axis. The original ranking has on average 4 actives and passives in the top-12, as shown in the table in the top left corner. Also, by definition, the divergence of this original ranking is 0. Each point (or bubble) in the plot represents a specific assignment to the parameters of the perturbation function: alpha and beta. We see on the plot that the only way to increase the objective on the y-axis is to also allow an increase in the divergence on the x-axis. We also see that for a given divergence, say 50, there are many assignments of alpha and beta with that divergence, with varying scores on the objective. We want the maximum objective for each divergence, and those are the points on the Pareto frontier, which are the red points in the plot. So, no matter what divergence you allow, you should pick a point on the Pareto frontier. Back to the table of sample plans, we see that if we set alpha and beta to 1.15, we can double the number of actives and passives in the top-12 (from 4 to 8) while paying the cost of having a divergence of 64, and that this is a point on the Pareto frontier.
  19. Here we can get a better idea of what the divergence scores actually mean. The top left has the distribution of the original, unperturbed model, and as we move across the quadrant, we see how the divergence increases (0, 27, 54, and 100). In the top left histogram, we see the bump around the 0.9s, and with each histogram, the bump is gradually attenuated, until there is no more bump in the bottom right. So, we would probably be willing to accept a divergence in the 50–60 range (as shown in the bottom left), but not in the 100s, which is what's shown in the bottom right.
  20. So, how will we actually learn the weights in the perturbation function, in other words, the right values for alpha and beta? There are several different ways of doing so, and the most appropriate varies with the specific use case. Grid search, for one, is very easy to implement: simply generate all possible solutions (up to some discretization amount) and evaluate them. This is feasible for small search spaces, but it quickly becomes unwieldy due to the combinatorial explosion that results from a large number of parameters. Gradient-based techniques would be another approach, and this would be useful in a scenario where the perturbation function has a high number of parameters and grid search is infeasible. We mentioned earlier that there is an issue with our objective and divergence functions which makes gradient optimization hard, and we will see how to overcome that.
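Continuing the toy sketch above, grid search plus a Pareto filter might look like the following (illustrative only; the grid range and step are our own choices, not the values used in the experiments):

```python
# Grid-search sketch over (alpha, beta), reusing the toy perturb/divergence/objective
# functions and synthetic scores/segment from the earlier sketch. We keep the
# Pareto-optimal settings: no other setting has both lower divergence and higher objective.
import numpy as np

grid = np.arange(1.00, 1.31, 0.01)
results = []                                          # (alpha, beta, divergence, objective)
for alpha in grid:
    for beta in grid:
        p = perturb(scores, segment, alpha, beta)
        results.append((alpha, beta, divergence(scores, p), objective(p, segment)))

def pareto_frontier(points):
    """Keep points not dominated by any other (lower divergence AND higher objective)."""
    frontier = []
    for a, b, d, o in points:
        dominated = any(d2 <= d and o2 >= o and (d2 < d or o2 > o)
                        for _, _, d2, o2 in points)
        if not dominated:
            frontier.append((a, b, d, o))
    return sorted(frontier, key=lambda r: r[2])

for alpha, beta, d, o in pareto_frontier(results):
    print(f"alpha={alpha:.2f} beta={beta:.2f}  divergence={d:.1f}  objective={o}")
```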
  21. Assuming the objective and divergence functions are amenable to gradient-based optimization techniques, they are typically scalarized into a single loss function, and then optimized for a given value of the lambda tradeoff parameter. The Pareto frontier can in many cases be approximated with an approach like this by optimizing the loss function for several values of lambda, a couple of which are represented by the lines with different slopes in the diagram. The point where a loss line is tangent to the Pareto frontier represents an optimal solution for a given value of lambda.
  22. We mentioned earlier that our objective and constraint functions are not smooth, since they depend on a sort, and so they're not readily amenable to gradient-based methods. However, this is a problem that has been looked at in the field of information retrieval, where they've come up with methods for “learning to rank”. There's been research on approximations to popular rank-based metrics such as the normalized discounted cumulative gain (NDCG) and the average precision (AP) which are amenable to gradient descent. We can leverage this work and frame our optimization problem using those metrics, where our objective function takes the shape of the approximate AP, and our divergence function takes the shape of the approximate NDCG. The approximate NDCG is highest when we've ranked the candidates with the highest match scores first, which is a property we can exploit to constrain the perturbed model. There are more ways to do this, which I discuss in the paper, but once we frame our TalentMatch problem this way, we can then apply gradient descent to optimize alpha and beta for several values of lambda.
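One common way to make a rank-based quantity differentiable, shown here purely as an illustration and not necessarily the exact approximation the authors used, is to replace the hard rank with a sigmoid-based soft rank:

```python
# Sketch of the "smooth approximation to a rank-based metric" idea mentioned above,
# using the common sigmoid soft-rank trick (one of several options). With a soft rank,
# an NDCG-style quantity becomes differentiable in the scores, so parameters such as
# alpha and beta could in principle be tuned by gradient descent.
import numpy as np

def soft_rank(scores, temperature=0.01):
    """Differentiable approximation of 1-based ranks:
    rank_i ≈ 1 + sum_{j != i} sigmoid((s_j - s_i) / temperature)."""
    diff = (scores[None, :] - scores[:, None]) / temperature
    p_ij = 1.0 / (1.0 + np.exp(-diff))                 # soft indicator that j outranks i
    np.fill_diagonal(p_ij, 0.0)
    return 1.0 + p_ij.sum(axis=1)

def smooth_dcg(scores, gains, temperature=0.01):
    """DCG with the hard rank replaced by the soft rank (differentiable in scores)."""
    r = soft_rank(scores, temperature)
    return float(np.sum(gains / np.log2(r + 1.0)))

# Toy check: as the temperature shrinks, the smooth value approaches the exact DCG.
rng = np.random.default_rng(0)
scores, gains = rng.random(10), rng.random(10)
exact_rank = np.argsort(np.argsort(-scores)) + 1
exact_dcg = float(np.sum(gains / np.log2(exact_rank + 1)))
print("smooth:", round(smooth_dcg(scores, gains), 3), " exact:", round(exact_dcg, 3))
```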
  23. Given that we only had 2 parameters in our perturbation function, grid search was a satisfactory approach and so that's what we used. When you have a set of Pareto-optimal values, typically what is done is to look for the proverbial knee of the curve, a point after which you have to pay too much in one objective to get increases in another, and our curve actually displays this characteristic: the Pareto tradeoff is constant up to a divergence of about 60, which, as we saw earlier in the histogram slide, was not too bad. Still, we did not know exactly what a given divergence would do to the booking and email rate, so we picked a couple of values to A/B test. We picked the maximum value on that line, the one at the knee, and a point in the middle, which corresponded to boosts of 1.15 and 1.07 respectively. So, what did we expect from the tests? Since we knew the rate of reply to career-related emails of users with high job-seeking intent, as well as the expected proportion of those users in the top-12 recommendations, it was easy to get a ballpark figure of how much of an increase in reply rate we would obtain: we expected a 50% increase over control for the 1.07 treatment and a 100% increase over control for the 1.15 treatment. Regarding the other 2 facets of the utility function, the booking rate and the email rate of job posters to candidates, what we hoped was that they would remain unchanged or only be minimally affected.
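One way to read that ballpark (our interpretation; the slides do not spell out the arithmetic): if replies come overwhelmingly from actives and passives, who reply roughly 16x as often as non-seekers, then the reply rate scales approximately with their count in the top 12, so going from about 4 to 6 of them suggests roughly a 50% lift and going from about 4 to 8 suggests roughly a 100% lift over control.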
  24. So, how did we do? Let's see how each facet of the utility was affected. The booking rate remained mostly unchanged, with possibly a very slight dip of 0.4% in the 1.15 treatment. The email rate, to our surprise, actually increased in both treatments. This tells us that somehow the profiles of users with high job-seeking intent were more appealing to job posters than those of users who weren't. Specifically what about their profiles was more appealing is something we have yet to look into. This also tells us that maybe the snippets that we show job posters were not a great representation of the value for them, and that perhaps better snippets would lead to higher booking rates. Finally, we see that we were able to increase the reply rate, which is what we had originally set out to do, and that the increase for the 1.15 treatment was double that of the 1.07 treatment: 42% and 22%, which was in line with our expectations. Now, these numbers are pretty good, but why weren't they as high as we had expected? Well, we had thought that job posters contacted all the recommendations, since it did not cost them more to contact all than to contact one, but as we observed in the email rate, which we were able to improve, job posters do not, in fact, contact all of the recommendations.
  25. So, in conclusion, I'd like to present you with 2 main takeaways. First, recommender systems often have a multifaceted utility function, of which matching is not just a big part, it is the crucial component, the secret sauce of the system. As you optimize for additional objectives, know exactly how the quality of the matches becomes affected, and justify any sacrifices to it. You don't want to kill the goose for its golden eggs. On the other hand, if you are just focusing on matching, then you are running a suboptimal system. We have presented a way to handle competing objectives, in case they surface as part of improving the utility. Listen to user feedback, which is how we found out that the reply rate was indeed the area to focus on in TalentMatch. The users told us that the quality of the matches was great, but they wanted the candidates to engage with them. A/B test furiously. We have tons of examples of theory not meeting practice, offline not meeting online. Make sure you get a reality check.