Find and be Found: Information Retrieval at LinkedIn
Shakti Sinha, Head, Search Relevance
Daniel Tunkelang, Head, Query Understanding
Why do 225M+ people use LinkedIn?
Profile: the professional identity of record.
Job recommendations.
Publishing platform for professional content.
Search helps members find and be found.
Search for people, jobs, groups, and more.
Every search is personalized.
Let’s talk a bit about how it all works.
§  Query Understanding
§  Ranking
More at http://data.linkedin.com/search.
Query Understanding
Daniel Tunkelang
Head, Query Understanding
Pre-retrieval: segment and tag queries.
lucene software engineer -> lucene “software engineer”
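One way to picture the segment-and-tag step is a greedy longest-match lookup against an entity dictionary. The sketch below is only an illustration: the dictionary entries, the tag names, and the `segment_and_tag` function are assumptions made for this example, not LinkedIn's tagger.

```python
from typing import Dict, List, Tuple

# Hypothetical entity dictionary (phrase -> tag). A production system would
# derive this from structured profile data; these entries are illustrative.
ENTITY_TAGS: Dict[str, str] = {
    "software engineer": "TITLE",
    "lucene": "SKILL",
    "oracle": "COMPANY",
    "dba": "TITLE",
}


def segment_and_tag(query: str, max_len: int = 3) -> List[Tuple[str, str]]:
    """Greedy longest-match segmentation against the entity dictionary."""
    tokens = query.lower().split()
    segments: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in ENTITY_TAGS:
                segments.append((phrase, ENTITY_TAGS[phrase]))
                i += n
                break
        else:
            # No dictionary phrase starts here; keep the token untagged.
            segments.append((tokens[i], "UNKNOWN"))
            i += 1
    return segments


print(segment_and_tag("lucene software engineer"))
```

Greedy matching keeps "software engineer" together as one segment, which corresponds to the quoted phrase in the slide's example.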
LinkedIn’s focus: entity-oriented search.
[Figure: search interface screenshot; labels: Company, Employees, Jobs, Name, Search]
Query tagging: key to query understanding.
§  Using human judgments to evaluate tag precision.
–  Extremely accurate (> 99%) for identifying person names.
–  Harder to distinguish company vs. title vs. skill (e.g., oracle dba).
§  Comparing CTR for tag matches vs. non-matches.
–  Difference can be large enough to suggest filtering on the tag rather than only ranking by it.
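The CTR comparison can be sketched as a simple aggregation over impression logs. The log format, the `suggest_filtering` helper, and its `min_ratio` threshold are hypothetical choices for illustration; the slides give no specific threshold.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def ctr_by_tag_match(impressions: Iterable[Tuple[bool, bool]]) -> Dict[bool, float]:
    """impressions holds (tag_matched, clicked) pairs; returns CTR per match flag."""
    shown: Dict[bool, int] = defaultdict(int)
    clicked: Dict[bool, int] = defaultdict(int)
    for matched, was_clicked in impressions:
        shown[matched] += 1
        clicked[matched] += int(was_clicked)
    return {flag: clicked[flag] / shown[flag] for flag in shown}


def suggest_filtering(ctr: Dict[bool, float], min_ratio: float = 5.0) -> bool:
    """Hypothetical rule: filter when matches out-click non-matches by min_ratio."""
    if ctr.get(False, 0.0) == 0.0:
        return ctr.get(True, 0.0) > 0.0
    return ctr.get(True, 0.0) / ctr[False] >= min_ratio


# Toy log: 3 of 4 tag-matching results clicked, 1 of 4 non-matching results clicked.
log = [(True, True), (True, True), (True, False), (True, True),
       (False, False), (False, False), (False, True), (False, False)]
ctr = ctr_by_tag_match(log)
print(ctr, suggest_filtering(ctr))
```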
Detecting navigational vs. exploratory queries.
Pre-retrieval
§  Sequence of query tags.
Post-retrieval
§  Distribution of scores / features.
Click behavior
§  Title searches are >50x more likely to get 2+ clicks than name searches.
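The pre-retrieval and post-retrieval signals could each be turned into a rough heuristic, as sketched below. The tag names, the relative score-gap rule, and the `min_gap` value are assumptions for illustration; the slides do not specify how the signals are combined.

```python
from typing import Sequence


def is_navigational_pre(tags: Sequence[str]) -> bool:
    """Pre-retrieval guess: a query made up only of name tags looks navigational."""
    return len(tags) > 0 and all(t in ("FIRST_NAME", "LAST_NAME") for t in tags)


def is_navigational_post(scores: Sequence[float], min_gap: float = 0.5) -> bool:
    """Post-retrieval guess: one result scores far above the runner-up."""
    ranked = sorted(scores, reverse=True)
    if not ranked:
        return False
    if len(ranked) == 1:
        return True
    top, runner_up = ranked[0], ranked[1]
    # Relative gap between the best and second-best score.
    return top > 0 and (top - runner_up) / top >= min_gap


print(is_navigational_pre(["FIRST_NAME", "LAST_NAME"]))  # name-only query
print(is_navigational_post([5.0, 4.8, 4.7]))             # flat score distribution
```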
Query expansion for exploratory queries.
Example query: software patent lawyer
§  Query expansions are derived from reformulations, e.g., lawyer -> attorney.
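A minimal sketch of mining expansions from reformulations: count cases where consecutive queries in a session differ by exactly one token, and keep substitutions seen often enough. The session format, `min_count`, and the function names are assumptions for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List


def mine_expansions(sessions: Iterable[List[str]], min_count: int = 2) -> Dict[str, List[str]]:
    """Collect single-token substitutions between consecutive queries in a session."""
    pair_counts: Counter = Counter()
    for queries in sessions:
        for before, after in zip(queries, queries[1:]):
            a, b = before.lower().split(), after.lower().split()
            if len(a) != len(b):
                continue
            diffs = [(x, y) for x, y in zip(a, b) if x != y]
            if len(diffs) == 1:
                pair_counts[diffs[0]] += 1
    expansions: Dict[str, List[str]] = {}
    for (src, dst), count in pair_counts.items():
        if count >= min_count:
            expansions.setdefault(src, []).append(dst)
    return expansions


def expand(query: str, expansions: Dict[str, List[str]]) -> List[List[str]]:
    """Return, per token, the token followed by its mined alternatives."""
    return [[tok] + expansions.get(tok, []) for tok in query.lower().split()]


sessions = [
    ["software patent lawyer", "software patent attorney"],
    ["tax lawyer", "tax attorney"],
    ["java developer", "java engineer"],
]
expansions = mine_expansions(sessions)
print(expand("software patent lawyer", expansions))
```

The `min_count` cutoff drops one-off rewrites such as developer -> engineer in the toy data, so only repeated reformulations become expansions.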
Understanding misspelled queries.
daniel tankalong infomation retrieval
–  Did you mean daniel tunkelang?
–  Did you mean information retrieval?
marisa meyer ingenero eletrico
–  Did you mean marissa mayer?
–  Did you mean ingeniero electrico?
jonathan podemsky desenista industrail
–  Did you mean johnathan podemsky?
–  Did you mean desenhista industrial?
Spelling out the details.
[Figure: sources feeding the spelling INDEX]
§  entity data: people, companies
§  successful queries: tunkelang =>
§  reformulations: marisa => marissa
§  n-grams: dublin => du ub bl li in
§  metaphones: mark/marc => MRK
§  word pairs: johnathan podemsky
Example: for the query marisa meyer yoohoo, the index yields the candidates {marissa, marisa}, {meyer, mayer}, {yahoo, yoohoo}.
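The n-gram part of the index can be sketched as follows: index each known word by its character bigrams, collect words sharing a bigram with the query token, and keep those within a small edit distance. The vocabulary, the `max_distance` cutoff, and the use of Levenshtein distance for ranking are assumptions for illustration; the other sources in the figure (metaphones, word pairs, reformulations) are not modeled here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def bigrams(word: str) -> List[str]:
    """Character bigrams, e.g. dublin -> du ub bl li in."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def build_index(vocabulary: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each bigram to the set of vocabulary words containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for word in vocabulary:
        for gram in bigrams(word):
            index[gram].add(word)
    return index


def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def candidates(word: str, index: Dict[str, Set[str]], max_distance: int = 2) -> List[str]:
    """Words sharing at least one bigram with `word`, closest first."""
    seen: Set[str] = set()
    for gram in bigrams(word):
        seen |= index.get(gram, set())
    scored = sorted((edit_distance(word, c), c) for c in seen)
    return [c for d, c in scored if d <= max_distance]


index = build_index(["marissa", "mayer", "yahoo", "tunkelang", "dublin"])
for token in "marisa meyer yoohoo".split():
    print(token, candidates(token, index))
```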
Ranking
Shakti Sinha
Head, Search Relevance
LinkedIn search is personalized.
Example query: kevin scott
But global factors matter.
Relevant results can be in or out of network.
§  Searcher’s network matters for relevance.
–  In-network results have higher CTR.
§  But the network is not enough.
–  About two thirds of search clicks come from out-of-network results.
Personalized machine-learned ranking.
§  Data point is a triple (searcher, query, document).
–  Searcher features are important!
§  Labels: Is this document relevant to the query and the user?
–  Depends on the user’s network, location, etc.
–  Too much to ask a random person to judge.
§  Training data has to be collected from search logs.
Search log data has biases.
§  Presentation bias
–  Results shown higher tend to get clicked more often.
–  Use FairPairs [Radlinski and Joachims, AAAI’06].
[Figure: adjacent result pairs marked not flipped / flipped / flipped, one result marked Clicked!, with ✔ and ✗ marks beside the pairs leading to training data.]
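The sketch below is a simplified reading of the FairPairs idea from the cited paper: group adjacent results into pairs, flip each pair at random before presentation, and treat a click on the lower result of a pair as a preference over the result above it. Function names and the exact click rule are assumptions here; consult the paper for the precise procedure.

```python
import random
from typing import List, Sequence, Set, Tuple


def fair_pairs_present(ranking: Sequence[str],
                       rng: random.Random) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Return the presented ranking and the adjacent index pairs that were formed."""
    results = list(ranking)
    start = rng.choice([0, 1])  # pair up (0,1),(2,3),... or (1,2),(3,4),...
    pairs: List[Tuple[int, int]] = []
    for i in range(start, len(results) - 1, 2):
        if rng.random() < 0.5:  # flip each pair with probability 1/2
            results[i], results[i + 1] = results[i + 1], results[i]
        pairs.append((i, i + 1))
    return results, pairs


def preferences_from_clicks(presented: Sequence[str],
                            pairs: Sequence[Tuple[int, int]],
                            clicked: Set[str]) -> List[Tuple[str, str]]:
    """Emit (preferred, less_preferred) when only the lower result of a pair is clicked."""
    prefs: List[Tuple[str, str]] = []
    for top, bottom in pairs:
        if presented[bottom] in clicked and presented[top] not in clicked:
            prefs.append((presented[bottom], presented[top]))
    return prefs


rng = random.Random(0)
presented, pairs = fair_pairs_present(["a", "b", "c", "d", "e"], rng)
print(presented, pairs)
```

Because each pair is flipped at random, every document in a pair appears in the upper slot about half the time, which is what lets position effects cancel out across many impressions.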
§  Sample bias
–  User clicks or skips only what is shown.
–  What about low-scoring results from the existing model?
–  Add low-scoring results as ‘easy negatives’ so the model learns about bad results that are never presented to the user.
[Figure: results across page 1, page 2, page 3, … page n, with low-scoring results marked label 0.]
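Adding easy negatives might look like the sketch below: keep the labels observed for shown results, then sample unshown documents from deep in the existing model's ranking and label them 0. The `page_size`, `skip_pages`, and `num_negatives` parameters are hypothetical; the slides do not say how many negatives are drawn or from which pages.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple


def add_easy_negatives(ranked_docs: Sequence[str],
                       shown_labels: Dict[str, int],
                       page_size: int = 10,
                       skip_pages: int = 2,
                       num_negatives: int = 4,
                       rng: Optional[random.Random] = None) -> List[Tuple[str, int]]:
    """Combine logged labels with label-0 samples from the low-scoring tail.

    ranked_docs: every document retrieved by the existing model, best first.
    shown_labels: labels inferred from clicks and skips on shown results.
    """
    rng = rng or random.Random()
    # Only documents beyond the first `skip_pages` pages, never shown to the user.
    tail = [d for d in ranked_docs[page_size * skip_pages:] if d not in shown_labels]
    sampled = rng.sample(tail, min(num_negatives, len(tail)))
    examples = list(shown_labels.items())
    examples.extend((doc, 0) for doc in sampled)
    return examples


ranked = [f"d{i}" for i in range(50)]
shown = {"d0": 1, "d1": 0}
examples = add_easy_negatives(ranked, shown, rng=random.Random(1))
print(examples)
```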
How to train your model.
§  Train simple models to resemble complex ones.
–  Build an Additive Groves model [Sorokina et al., ECML ’07], which is good at detecting interactions.
§  Build a tree with logistic regression leaves.
§  By restricting the tree to user and query features, only one regression model is evaluated for each document.
[Figure: decision tree with split nodes $X_2 = ?$ and $X_{10} < 0.1234$ and three logistic regression leaves:]
$\beta_0 + \beta_1 T(x_1) + \dots + \beta_n x_n$
$\alpha_0 + \alpha_1 P(x_1) + \dots + \alpha_n Q(x_n)$
$\gamma_0 + \gamma_1 R(x_1) + \dots + \gamma_n Q(x_n)$
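The tree-with-regression-leaves structure can be sketched as below. Splits look only at searcher and query features, so the tree is walked once per search and the selected leaf's logistic regression then scores every document. Class names, feature names, and weights are illustrative assumptions, and the per-feature transforms shown in the figure (T, P, Q, R) are omitted for brevity.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Union


@dataclass
class Leaf:
    bias: float
    weights: Sequence[float]  # one weight per document feature

    def score(self, doc_features: Sequence[float]) -> float:
        """Logistic regression over the document features."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, doc_features))
        return 1.0 / (1.0 + math.exp(-z))


@dataclass
class Split:
    test: Callable[[Dict[str, float]], bool]  # sees only searcher/query features
    if_true: "Node"
    if_false: "Node"


Node = Union[Leaf, Split]


def select_leaf(node: Node, context: Dict[str, float]) -> Leaf:
    """Walk the tree using searcher and query features only."""
    while isinstance(node, Split):
        node = node.if_true if node.test(context) else node.if_false
    return node


def rank(tree: Node, context: Dict[str, float],
         docs: Dict[str, Sequence[float]]) -> List[str]:
    leaf = select_leaf(tree, context)  # done once per search, not per document
    return sorted(docs, key=lambda name: leaf.score(docs[name]), reverse=True)


tree = Split(test=lambda c: c["x10"] < 0.1234,
             if_true=Leaf(0.0, [1.0, 0.0]),
             if_false=Leaf(0.0, [0.0, 1.0]))
docs = {"a": [2.0, 0.0], "b": [0.0, 2.0]}
print(rank(tree, {"x10": 0.05}, docs))
print(rank(tree, {"x10": 0.5}, docs))
```

The same two documents come back in opposite orders for the two contexts, which is the personalization effect: the searcher and query pick the model, and the model ranks the documents.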
Take-Aways
§  LinkedIn’s search problem is unique because of the deep role of personalization: users are an integral part of the corpus.
§  Query understanding allows us to optimize for entity-oriented search against semi-structured content.
§  Ranking requires us to contextually apply global and personalized user, query, and document features.
Thank you!
Want to learn more?
§  Check out http://data.linkedin.com/search.
§  Contact us:
–  Shakti: ssinha@linkedin.com
http://linkedin.com/in/sdsinha
–  Daniel: dtunkelang@linkedin.com
http://linkedin.com/in/dtunkelang
–  Asif: amakhani@linkedin.com
http://linkedin.com/in/asifmakhani
§  Did we mention that we’re hiring?
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Último (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

How LinkedIn's Search Works: Query Understanding and Personalized Ranking

  • 1. Find and be Found: Information Retrieval at LinkedIn. Shakti Sinha (Head, Search Relevance) and Daniel Tunkelang (Head, Query Understanding).
  • 2. Why do 225M+ people use LinkedIn?
  • 3. Profile: the professional identity of record.
  • 5. Publishing platform for professional content.
  • 6. Search helps members find and be found.
  • 9. Search for people, jobs, groups, and more.
  • 10. Every search is personalized.
  • 11. Let’s talk a bit about how it all works: Query Understanding and Ranking. More at http://data.linkedin.com/search.
  • 13. Pre-retrieval: segment and tag queries. Example: lucene software engineer -> lucene “software engineer”
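The segmentation step on slide 13 can be illustrated with a minimal sketch. It assumes a greedy longest-match lookup against a small phrase dictionary; the dictionary contents, the tag names, and the function `segment_and_tag` are hypothetical and are not taken from the talk, which does not say how the production segmenter works.

```python
# Hypothetical phrase dictionary; a real system would derive entries
# from entity data rather than hard-coding them.
KNOWN_PHRASES = {
    "software engineer": "title",
    "lucene": "skill",
}


def segment_and_tag(query, phrases=KNOWN_PHRASES, max_len=3):
    """Greedily split a query into the longest known phrases and tag each one."""
    tokens = query.lower().split()
    segments = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink the window.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in phrases:
                segments.append((candidate, phrases[candidate]))
                i += n
                break
        else:
            # No dictionary entry starts here: keep the token, tag it unknown.
            segments.append((tokens[i], "unknown"))
            i += 1
    return segments


print(segment_and_tag("lucene software engineer"))
```

Greedy matching is only one possible strategy; it cannot resolve the company vs. title vs. skill ambiguity that slide 15 mentions.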
  • 14. LinkedIn’s focus: entity-oriented search. (Figure labels: Company, Employees, Jobs, Name, Search.)
  • 15. Query tagging: key to query understanding. Human judgments are used to evaluate tag precision: tagging is extremely accurate (> 99%) for identifying person names, but it is harder to distinguish company vs. title vs. skill (e.g., oracle dba). Comparing CTR for tag matches vs. non-matches: the difference can be large enough to suggest filtering rather than ranking.
  • 16. Detecting navigational vs. exploratory queries. Pre-retrieval: sequence of query tags. Post-retrieval: distribution of scores / features. Click behavior: title searches are >50x more likely to get 2+ clicks than name searches.
  • 17. Query expansion for exploratory queries. Example query: software patent lawyer. Query expansions are derived from reformulations, e.g., lawyer -> attorney.
  • 18. Understanding misspelled queries. daniel tankalong: Did you mean daniel tunkelang? infomation retrieval: Did you mean information retrieval? marisa meyer: Did you mean marissa mayer? ingenero eletrico: Did you mean ingeniero electrico? jonathan podemsky: Did you mean johnathan podemsky? desenista industrail: Did you mean desenhista industrial?
  • 19. Spelling out the details. Sources feeding the spelling index: entity data (people, companies); successful queries (tunkelang); reformulations (marisa => marissa); n-grams (dublin => du ub bl li in); metaphones (mark/marc => MRK); word pairs (johnathan podemsky). Example: the query marisa meyer yoohoo is looked up in the index, giving the candidates marissa / marisa, meyer / mayer, yahoo / yoohoo.
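The n-gram entry on slide 19 (dublin => du ub bl li in) can be reproduced with a short sketch. Scoring candidates by Jaccard overlap of character bigrams is an assumption made here for illustration; the slide does not say how n-gram matches are scored, and the names `char_ngrams`, `jaccard`, and `suggest` are hypothetical.

```python
def char_ngrams(word, n=2):
    """Return the overlapping character n-grams of a word, in order."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def jaccard(a, b):
    """Jaccard similarity of two n-gram collections, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def suggest(word, vocabulary, n=2):
    """Pick the vocabulary entry whose n-grams overlap most with the input."""
    grams = char_ngrams(word, n)
    return max(vocabulary, key=lambda v: jaccard(grams, char_ngrams(v, n)))


print(char_ngrams("dublin"))  # ['du', 'ub', 'bl', 'li', 'in']
print(suggest("marisa", ["marissa", "mayer", "yahoo"]))  # marissa
```

A phonetic key such as the metaphone code on the same slide would complement this, since bigram overlap alone treats mark and marc as only partially similar.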
  • 21. LinkedIn search is personalized. Example query: kevin scott
  • 22. But global factors matter.
  • 23. Relevant results can be in or out of network. The searcher’s network matters for relevance: within-network results have higher CTR. But the network is not enough: about two thirds of search clicks come from out-of-network results.
  • 24. Personalized machine-learned ranking. A data point is a triple (searcher, query, document), so searcher features are important. Labels: is this document relevant to the query and the user? That depends on the user’s network, location, etc., which is too much to ask a random person to judge. Training data therefore has to be collected from search logs.
  • 25. Search log data has biases. Presentation bias: results shown higher tend to get clicked more often. Use FairPairs [Radlinski and Joachims, AAAI’06]. (Figure: adjacent result pairs marked not flipped / flipped; a click on one result of a pair yields training data.)
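The pair-flipping idea behind FairPairs on slide 25 can be sketched roughly as follows. This is a simplified illustration, assuming that adjacent results are grouped into pairs, each pair is swapped with probability 0.5, and a preference is recorded only when the lower result of a pair is clicked and the upper one is not. The published algorithm may differ in its details, and the function names are hypothetical.

```python
import random


def fair_pairs_present(ranking, rng=random):
    """Randomly flip adjacent pairs before showing results to the user."""
    results = list(ranking)
    start = rng.choice([0, 1])  # random offset so pair boundaries vary
    pairs = []
    for i in range(start, len(results) - 1, 2):
        if rng.random() < 0.5:
            results[i], results[i + 1] = results[i + 1], results[i]
        pairs.append((i, i + 1))
    return results, pairs


def preferences_from_clicks(presented, pairs, clicked):
    """Emit (preferred, other) when only the lower result of a pair is clicked."""
    prefs = []
    for top, bottom in pairs:
        if presented[bottom] in clicked and presented[top] not in clicked:
            prefs.append((presented[bottom], presented[top]))
    return prefs


presented, pairs = fair_pairs_present(["a", "b", "c", "d", "e"])
print(presented, pairs)
```

Because each document in a pair is equally likely to appear in the upper slot, a click on the lower one cannot be explained by position alone, which is what makes the resulting preference usable as training data.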
  • 26. Search log data has biases. Sample bias: the user clicks or skips only what is shown. What about low-scoring results from the existing model? Add low-scoring results as ‘easy negatives’ so the model learns about bad results that are not presented to the user. (Figure: results on pages 1, 2, 3, ..., n, with the low-scoring results assigned label 0.)
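The ‘easy negatives’ idea on slide 26 can be sketched as below. The sketch assumes that shown results are labelled 1 if clicked and 0 otherwise, and that easy negatives are sampled uniformly from results ranked below a cutoff. The cutoff `tail_start`, the sample size `n_easy`, and the function name are hypothetical choices, not values from the talk.

```python
import random


def build_training_set(shown, clicked, ranked_all,
                       n_easy=4, tail_start=100, rng=random):
    """Label shown results from clicks, then add deep results as label-0 examples."""
    # Shown results: clicked -> 1, skipped -> 0 (a simplifying assumption).
    examples = [(doc, 1 if doc in clicked else 0) for doc in shown]
    # Results the existing model ranked far down were never seen by the user;
    # sample a few and treat them as easy negatives.
    tail = ranked_all[tail_start:]
    for doc in rng.sample(tail, min(n_easy, len(tail))):
        examples.append((doc, 0))
    return examples


ranked = list(range(200))  # document ids in the existing model's order
print(build_training_set(shown=ranked[:3], clicked={1}, ranked_all=ranked))
```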
  • 27. How to train your model.
  • 28. How to train your model. Train simple models to resemble complex ones: build an Additive Groves model [Sorokina et al, ECML ’07], which is good at detecting interactions. Build a tree with logistic regression leaves. By restricting the tree to user and query features, only the regression model is evaluated for each document. (Figure: a tree with splits “X2 = ?” and “X10 < 0.1234 ?” and three leaves: β0 + β1 T(x1) + ... + βn xn; α0 + α1 P(x1) + ... + αn Q(xn); γ0 + γ1 R(x1) + ... + γn Q(xn).)
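The structure on slide 28, a tree over user and query features with a logistic regression model at each leaf, can be sketched as follows. The class names, feature names, and weights are hypothetical, and the per-feature transforms shown on the slide (T, P, Q, R) are omitted for brevity. The point illustrated is that the tree is walked once per (searcher, query), after which only a linear model runs per document.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass
class Leaf:
    bias: float
    weights: Sequence[float]

    def score(self, doc_features):
        """Logistic regression over document features."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, doc_features))
        return 1.0 / (1.0 + math.exp(-z))


@dataclass
class Split:
    feature: str          # name of a user or query feature
    threshold: float
    below: "Node"         # taken when context[feature] < threshold
    at_or_above: "Node"


Node = Union[Leaf, Split]


def select_leaf(node, context):
    """Walk the tree using only user/query features."""
    while isinstance(node, Split):
        if context[node.feature] < node.threshold:
            node = node.below
        else:
            node = node.at_or_above
    return node


def rank(tree, context, docs):
    leaf = select_leaf(tree, context)  # evaluated once per search
    return sorted(docs, key=leaf.score, reverse=True)


tree = Split("x10", 0.1234, Leaf(0.0, [1.0, 0.0]), Leaf(0.0, [0.0, 1.0]))
print(rank(tree, {"x10": 0.05}, [[0.2, 0.9], [0.8, 0.1]]))
```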
  • 29. Take-Aways. LinkedIn’s search problem is unique because of the deep role of personalization: users are an integral part of the corpus. Query understanding allows us to optimize for entity-oriented search against semi-structured content. Ranking requires us to contextually apply global and personalized user, query, and document features.
  • 31. Want to learn more? Check out http://data.linkedin.com/search. Contact us: Shakti: ssinha@linkedin.com, http://linkedin.com/in/sdsinha; Daniel: dtunkelang@linkedin.com, http://linkedin.com/in/dtunkelang; Asif: amakhani@linkedin.com, http://linkedin.com/in/asifmakhani. Did we mention that we’re hiring?