Learning To Rank
with Apache Solr and Fusion
Trey Grainger
Chief Algorithms Officer
Andy Liu
Senior Data Scientist
The Problem: Classic Similarity
Keyword search isn’t always enough for relevance
The Problem
User searches for “outdoor rock speaker”
Should see this: [image not preserved]
Sees this: [image not preserved]
The Problem
• Improving search relevance is hard.
• TF-IDF and BM25 work well for keyword text matching, but what about other
models of relevance?
• Text matching is sometimes not the best solution.
• Users don’t always say what they mean.
The Solution:
Fusion 4 Signals + Solr 7 LTR
The Solution: Learning to Rank Overview
• Learning to rank lets you pick the “features” of a document that
“matter” and teach the machine how to rank a set of items.
• One possible source of ordering is user behavior (e.g. the only clicks were
on the speaker shaped like a rock).
• Solr provides a Learning to Rank implementation.
• Fusion provides a way of capturing user behavior through signals.
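To make "teach the machine how to rank" concrete, here is a minimal pointwise sketch in pure Python. It assumes two hypothetical features per document (a text-match score and a click rate) and binary relevance labels. The deck later names logistic regression as the model family it compares, but the training loop, feature names, and data below are illustrative assumptions, not the presenters' code.

```python
import math


def train_logistic(rows, labels, epochs=500, lr=0.5):
    """Fit weights and a bias by batch gradient descent on log loss."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of relevance
            for j, xj in enumerate(x):
                grad_w[j] += (p - y) * xj
            grad_b += p - y
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(rows)
    return w, b


def rank(docs, w, b):
    """Order (doc_id, features) pairs by descending model score."""
    def score(x):
        return b + sum(wi * xi for wi, xi in zip(w, x))
    ordered = sorted(docs, key=lambda d: score(d[1]), reverse=True)
    return [doc_id for doc_id, _ in ordered]


# Hypothetical training data: [text_match_score, click_rate] -> relevant?
rows = [[0.9, 0.0], [0.8, 0.1], [0.5, 0.9], [0.4, 0.8]]
labels = [0, 0, 1, 1]
weights, bias = train_logistic(rows, labels)

candidates = [
    ("speaker_for_rock_music", [0.9, 0.0]),
    ("rock_shaped_speaker", [0.5, 0.9]),
]
print(rank(candidates, weights, bias))
```

In this toy data the click-rate feature separates relevant from irrelevant documents, so the learned model ranks the rock-shaped speaker above the stronger keyword match.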
The Solution: Fusion Signals Overview
The Solution
• Define features (relevancy factors)
• Derive Ground Truth using Fusion’s signals
• Use Solr’s Learning to Rank implementation
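The first and third steps can be sketched as JSON payloads for Solr's LTR module, built here in Python. The feature and model class names are the ones shipped with Solr's LTR contrib; the feature names, the `title` and `popularity` fields, and the weights are hypothetical. As far as I understand it, the payloads would be uploaded to the collection's `schema/feature-store` and `schema/model-store` endpoints, but the sketch only builds them and makes no network calls.

```python
import json

# Hypothetical feature and field names, chosen only for illustration.
FEATURES = [
    {
        "name": "originalScore",
        "class": "org.apache.solr.ltr.feature.OriginalScoreFeature",
        "params": {},
    },
    {
        "name": "titleMatch",
        "class": "org.apache.solr.ltr.feature.SolrFeature",
        "params": {"q": "{!field f=title}${user_query}"},
    },
    {
        "name": "popularity",
        "class": "org.apache.solr.ltr.feature.FieldValueFeature",
        "params": {"field": "popularity"},
    },
]


def linear_model(name, weights):
    """Build a LinearModel definition from a {feature_name: weight} mapping."""
    return {
        "class": "org.apache.solr.ltr.model.LinearModel",
        "name": name,
        "features": [{"name": feature} for feature in weights],
        "params": {"weights": dict(weights)},
    }


def rerank_params(query, model_name, top_n=100):
    """Query parameters asking Solr to re-rank the top_n hits with the model."""
    rq = f"{{!ltr model={model_name} reRankDocs={top_n} efi.user_query='{query}'}}"
    return {"q": query, "rq": rq, "fl": "id,score,[features]"}


model = linear_model("demoModel", {"originalScore": 0.5, "titleMatch": 1.0, "popularity": 0.1})
print(json.dumps(FEATURES, indent=2))
print(json.dumps(model, indent=2))
print(rerank_params("outdoor rock speaker", "demoModel"))
```

The `efi.user_query` parameter supplies the value substituted for `${user_query}` in the `titleMatch` feature at query time.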
Some notes
• Fusion’s standard click boosting is an alternative, and it works well
• The two can be used together, or one can be used where the other
doesn’t work
• Do the simpler things first: learning to rank without an
adequate schema won’t accomplish much.
Some notes
• Using click signals for ground truth
• Pros:
• Voluminous
• Cheap
• Reflects a captive user’s intent (especially when supplemented with purchase and
add-to-cart events)
• Implicitly labeled data is the key to an out-of-the-box (OOTB) “self-learning” system
• Cons:
• Noisy
• Potential for reinforcing existing ranking
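One way to turn raw signals into ground-truth labels is to weight each event type and normalise per query. The sketch below assumes signals arrive as dictionaries with hypothetical `query`, `doc_id`, and `type` keys, and uses made-up event weights. It does not reflect Fusion's actual signal schema or aggregation jobs.

```python
from collections import defaultdict


def labels_from_signals(signals, weights=None, max_grade=4):
    """Aggregate raw signals into graded relevance labels per (query, doc_id)."""
    # Assumed weights: stronger purchase intent counts for more than a click.
    weights = weights or {"click": 1.0, "add_to_cart": 3.0, "purchase": 5.0}

    totals = defaultdict(float)
    for s in signals:
        totals[(s["query"], s["doc_id"])] += weights.get(s["type"], 0.0)

    # Normalise within each query so that high-traffic queries do not dominate.
    best = defaultdict(float)
    for (query, _), total in totals.items():
        best[query] = max(best[query], total)

    return {
        key: round(max_grade * total / best[key[0]]) if best[key[0]] > 0 else 0
        for key, total in totals.items()
    }


signals = [
    {"query": "outdoor rock speaker", "doc_id": "rock_shaped_speaker", "type": "click"},
    {"query": "outdoor rock speaker", "doc_id": "rock_shaped_speaker", "type": "click"},
    {"query": "outdoor rock speaker", "doc_id": "rock_shaped_speaker", "type": "click"},
    {"query": "outdoor rock speaker", "doc_id": "rock_shaped_speaker", "type": "purchase"},
    {"query": "outdoor rock speaker", "doc_id": "rock_band_poster", "type": "click"},
    {"query": "outdoor rock speaker", "doc_id": "rock_band_poster", "type": "click"},
]
print(labels_from_signals(signals))
```

Per-query normalisation reduces some of the noise, but it does not remove position bias: documents that already rank high collect more clicks, which is the "reinforcing existing ranking" risk listed above.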
Putting it together
Building an LTR Pipeline
…but is it better?
• Models compared:
• Solr out-of-the-box BM25 ranking using textual features only
• Logistic Regression using all features except the signals feature
• Logistic Regression using all features
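The comparison chart is not preserved in this text, so the metric the presenters used is unknown. As an assumption, the sketch below uses NDCG@k, a common choice for comparing rankers against graded labels. Each input list holds the relevance grades of the documents in the order a given model returned them.

```python
import math


def dcg_at_k(grades, k):
    """Discounted cumulative gain over the top k results."""
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(grades[:k]))


def ndcg_at_k(grades, k):
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(grades, reverse=True), k)
    return dcg_at_k(grades, k) / ideal if ideal > 0 else 0.0


# Hypothetical grades for one query, in the order each model ranked the documents.
baseline_order = [1, 0, 3, 2]
ltr_order = [3, 2, 1, 0]
print(ndcg_at_k(baseline_order, 4), ndcg_at_k(ltr_order, 4))
```

Averaging this score over a held-out set of queries gives one number per model, which makes the three configurations above directly comparable.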
Why is it better?
• Summary of Benefits:
• LTR offers automated relevancy tuning
• Using Fusion to implement LTR greatly reduces the time and complexity
required to train and deploy LTR models in production
• Leveraging Fusion’s signals as features in an LTR model offers an easy way of
boosting search relevance performance beyond what is possible using textual
features alone
A/B and experiments
• Roll out a new ranking model carefully.
• A/B testing is the safest way to make sure a change does not degrade the
experience for some groups of users.
• Stay tuned for a future webinar on Experiments and A/B testing
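A careful rollout usually starts by sending a small, stable share of users to the new model. The sketch below is a generic hashing approach, not Fusion's experiments feature. The variant names, the experiment name, and the 10% default share are assumptions.

```python
import hashlib


def assign_variant(user_id, experiment, treatment_share=0.1):
    """Deterministically assign a user to the baseline or the LTR variant."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # The first 32 bits of the hash, scaled to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "ltr_model" if bucket < treatment_share else "baseline"


for user in ("user-1", "user-2", "user-3"):
    print(user, assign_variant(user, "ltr-rollout"))
```

Hashing the experiment name together with the user ID keeps each user in the same variant across sessions, while a different experiment reshuffles the assignment.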
Where to learn more?
• Grab the technical paper (with step by step instructions):
https://lucidworks.com/ebook/learning-to-rank/
• Grab the code: https://github.com/lucidworks/fusion-ltr-webinar#fusionsolr-setup
Thank you
Register by Sep 6 to save $200
SEPTEMBER 9-12, 2019 WASHINGTON DC
Check out the site here: https://activate-conf.com/
JOIN ANDY AND TREY AT ACTIVATE
• Productionizing Python ML Models Using Fusion 5, Sanket Shahane,
Andy Liu
• Natural Language Search with Knowledge Graphs, Trey Grainger
• Closing Keynote: The Next Generation of AI-powered Search, Trey
Grainger
AI, ML & DATA SCIENCE TRACK
• Supporting Query Tagging/Suggestion in Fusion 4.2, Uber
• Building a Health QA Chatbot with Solr, Healthwise Incorporated
• Tackling a “Small Data” Search Challenge at Airbnb Experiences,
Airbnb
• Using Deep Learning and Customized Solr Components to Improve
Search Relevancy at Target, Target
THE SEARCH AND AI
CONFERENCE
LIVE Q&A:
Enter your questions in the chat box now
for Trey to answer live

Learning to Rank with Apache Solr and Fusion

  • 1. Learning To Rank with Apache Solr and Fusion. Trey Grainger, Chief Algorithms Officer; Andy Liu, Senior Data Scientist
  • 2. The Problem: Classic Similarity. Keyword search isn’t always enough for relevance.
  • 3. The Problem: User searches for “outdoor rock speaker”. [Images: “Should see this” vs. “Sees this”]
  • 4. The Problem
      • Improving search relevance is hard.
      • TF-IDF and BM25 work well for keyword text matching, but what about other models of relevance?
      • Text matching is sometimes not the best solution.
      • Users don’t always say what they mean.
  • 5. The Solution: Fusion 4 Signals + Solr 7 LTR
  • 6. The Solution: Learning to Rank Overview
      • Learning to rank lets you pick the “features” of a document that “matter” and teach the machine how to rank a set of items.
      • One possible source of ordering is user behavior (e.g., the only clicks were on the speaker shaped like a rock).
      • Solr provides a Learning to Rank implementation.
      • Fusion provides a way of capturing user behavior through signals.
  • 7. The Solution: Learning to Rank Overview
  • 8. The Solution: Learning to Rank Overview
  • 9. The Solution: Fusion Signals Overview
  • 10. The Solution
      • Define features (relevancy factors)
      • Derive ground truth using Fusion’s signals
      • Use Solr’s Learning to Rank implementation
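The three steps on slide 10 map onto Solr's LTR module roughly as follows. This is a minimal sketch, not the configuration used in the talk: the store name `demo_store`, the model name `demo_model`, and the `title` and `popularity` fields are hypothetical, while the feature class names and the `{!ltr ...}` rerank syntax are the ones Solr's LTR module documents. The code only builds the JSON payload and the query parameters; uploading the payload to the collection's `schema/feature-store` endpoint is left out.

```python
import json

# Hypothetical store, model, and field names, chosen for illustration only.
FEATURES = [
    {
        "store": "demo_store",
        "name": "originalScore",
        # The first-pass (e.g. BM25) score of the document.
        "class": "org.apache.solr.ltr.feature.OriginalScoreFeature",
        "params": {},
    },
    {
        "store": "demo_store",
        "name": "titleMatch",
        # Score of the user's query against an assumed `title` field.
        "class": "org.apache.solr.ltr.feature.SolrFeature",
        "params": {"q": "{!field f=title}${user_query}"},
    },
    {
        "store": "demo_store",
        "name": "popularity",
        # Raw value of an assumed numeric `popularity` field.
        "class": "org.apache.solr.ltr.feature.FieldValueFeature",
        "params": {"field": "popularity"},
    },
]


def feature_store_payload():
    """Serialize the feature definitions as the JSON body for the feature store."""
    return json.dumps(FEATURES, indent=2)


def rerank_params(user_query, model="demo_model", rerank_docs=100):
    """Build query parameters that rerank the top documents with an LTR model.

    The quoting below is a simplification: it only escapes backslashes and
    single quotes in the external feature value.
    """
    escaped = user_query.replace("\\", "\\\\").replace("'", "\\'")
    return {
        "q": user_query,
        "rq": f"{{!ltr model={model} reRankDocs={rerank_docs} "
              f"efi.user_query='{escaped}'}}",
        # [features] asks Solr to return the computed feature values,
        # which is how training data is usually extracted.
        "fl": "id,score,[features]",
    }


if __name__ == "__main__":
    print(feature_store_payload())
    print(rerank_params("outdoor rock speaker"))
```

Reranking only the top `reRankDocs` results keeps the model's cost bounded: the first-pass query still does the retrieval, and the model reorders a short candidate list.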
  • 11. Some notes
      • Fusion’s standard click boosting is a good alternative.
      • The two can be used together, or one can be used where the other doesn’t work.
      • Do the simpler things first: learning to rank without an adequate schema won’t accomplish much.
  • 12. Some notes: using click signals for ground truth
      • Pros:
          • Voluminous
          • Cheap
          • Reflects a captive user’s intent (especially when supplemented with purchase and add-to-cart events)
          • Implicitly labeled data is the key to an out-of-the-box (OOTB) “self-learning” system
      • Cons:
          • Noisy
          • Potential for reinforcing the existing ranking
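One way to turn raw signals into ground-truth labels, as slide 12 describes, is to weight each event type, sum per query-document pair, and scale to a graded relevance label. The sketch below is an illustration under stated assumptions, not Fusion's aggregation logic: the event names, the weights in `SIGNAL_WEIGHTS`, and the 0 to 4 grading scale are all hypothetical.

```python
from collections import defaultdict

# Hypothetical weights: stronger intent signals count for more than clicks.
SIGNAL_WEIGHTS = {"click": 1.0, "add_to_cart": 3.0, "purchase": 5.0}


def grade_signals(signals, max_grade=4):
    """Map (query, doc_id, signal_type) events to graded relevance labels.

    Each pair's weighted total is scaled by the best total seen for the same
    query, so the most-engaged document for a query gets `max_grade`.
    """
    totals = defaultdict(float)
    for query, doc_id, signal_type in signals:
        totals[(query, doc_id)] += SIGNAL_WEIGHTS.get(signal_type, 0.0)

    # Highest weighted total per query, used as the normalizer.
    best = defaultdict(float)
    for (query, _doc_id), total in totals.items():
        best[query] = max(best[query], total)

    grades = {}
    for (query, doc_id), total in totals.items():
        if best[query] > 0:
            # Round half up to the nearest integer grade.
            grades[(query, doc_id)] = int(max_grade * total / best[query] + 0.5)
        else:
            grades[(query, doc_id)] = 0
    return grades


if __name__ == "__main__":
    events = [
        ("outdoor rock speaker", "rock_speaker", "click"),
        ("outdoor rock speaker", "rock_speaker", "click"),
        ("outdoor rock speaker", "rock_speaker", "click"),
        ("outdoor rock speaker", "rock_speaker", "purchase"),
        ("outdoor rock speaker", "rock_music_cd", "click"),
    ]
    for pair, grade in sorted(grade_signals(events).items()):
        print(pair, grade)
```

Normalizing per query keeps popular queries from dominating the labels, but it does nothing about the "reinforcing existing ranking" risk: documents shown higher collect more clicks regardless of relevance, so labels derived this way inherit that bias.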
  • 14. Building an LTR Pipeline
  • 15. …but is it better? Models compared:
      • Solr out-of-the-box BM25 ranking using textual features only
      • Logistic regression using all features except the signals feature
      • Logistic regression using all features
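To make the logistic regression variants on slide 15 concrete, here is a minimal pointwise sketch in plain Python. It is not the training code from the talk: the toy data, the feature names `textScore` and `clickSignal`, and the hyperparameters are hypothetical. The export step assumes Solr's `LinearModel`, which takes per-feature weights and has no bias term; dropping the bias does not change the ordering of documents within a query.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=500):
    """Fit logistic regression with batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this example: p(relevant) - label.
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def to_solr_linear_model(name, feature_names, weights):
    """Express learned weights in the JSON shape Solr's LinearModel expects."""
    return {
        "class": "org.apache.solr.ltr.model.LinearModel",
        "name": name,
        "features": [{"name": f} for f in feature_names],
        "params": {"weights": dict(zip(feature_names, weights))},
    }


# Toy rows of [textScore, clickSignal]; label 1 means "relevant".
ROWS = [[0.9, 0.0], [0.8, 0.1], [0.7, 0.9], [0.6, 1.0], [0.9, 0.8], [0.5, 0.0]]
LABELS = [0, 0, 1, 1, 1, 0]

if __name__ == "__main__":
    weights, bias = train_logistic(ROWS, LABELS)
    print(to_solr_linear_model("demo_model", ["textScore", "clickSignal"], weights))
```

Training a second model on the rows with the `clickSignal` column removed would reproduce the shape of the slide's comparison: same learner, with and without the signals feature.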
  • 16. Why is it better? Summary of benefits:
      • LTR offers automated relevancy tuning.
      • Using Fusion to implement LTR greatly reduces the time and complexity required to train and deploy LTR models in production.
      • Leveraging Fusion’s signals as features in an LTR model offers an easy way of boosting search relevance performance beyond what is possible using textual features alone.
  • 17. A/B and experiments
      • Do this carefully.
      • A/B testing is the safest way to make sure a ranking change doesn’t ruin the experience for different groups of users.
      • Stay tuned for a future webinar on experiments and A/B testing.
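A common building block for the A/B testing that slide 17 recommends is deterministic bucketing, so that a given user always sees the same ranker. The sketch below is a generic illustration and not Fusion's experiments feature: the experiment name, the variant labels, and the 50/50 split are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment="ltr_rerank", treatment_percent=50):
    """Deterministically assign a user to the LTR variant or the BM25 control.

    Hashing the experiment name together with the user id gives each
    experiment an independent split while keeping assignments stable.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in the range 0..99
    return "ltr" if bucket < treatment_percent else "bm25_control"


if __name__ == "__main__":
    for user in ("user-1", "user-2", "user-3"):
        print(user, assign_variant(user))
```

Starting with a small `treatment_percent` limits how many users are exposed if the new model ranks worse than the baseline.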
  • 18. Where to learn more?
      • Grab the technical paper (with step-by-step instructions): https://lucidworks.com/ebook/learning-to-rank/
      • Grab the code: https://github.com/lucidworks/fusion-ltr-webinar#fusionsolr-setup
  • 20. Join Andy and Trey at Activate, the Search and AI Conference: September 9-12, 2019, Washington DC. Register by Sep 6 to save $200. Check out the site here: https://activate-conf.com/
      • Productionizing Python ML Models Using Fusion 5, Sanket Shahane, Andy Liu
      • Natural Language Search with Knowledge Graphs, Trey Grainger
      • Closing Keynote: The Next Generation of AI-powered Search, Trey Grainger
      • AI, ML & Data Science track:
          • Supporting Query Tagging/Suggestion in Fusion 4.2, Uber
          • Building a Health QA Chatbot with Solr, Healthwise Incorporated
          • Tackling a “Small Data” Search Challenge at Airbnb Experiences, Airbnb
          • Using Deep Learning and Customized Solr Components to Improve Search Relevancy at Target, Target
  • 21. LIVE Q&A: Enter your questions in the chat box now for Trey to answer live