Generative Pseudo Labeling
윤용선
0. Paper
• GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
• Authors: Kexin Wang, Nandan Thakur, Nils Reimers, Iryna Gurevych
• Published: 2021.12 (Arxiv)
• https://arxiv.org/abs/2112.07577
0. Preliminaries
• Information Retrieval
• The task of finding documents relevant to a query (relevant = able to answer it)
• Open-domain QA: IR + MRC
• Method: select the document with the highest score (similarity) to the query
• Sparse embedding vs Dense embedding
• Sparse suits keywords and proper nouns; dense suits synonyms and paraphrases
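As a rough illustration of the scoring step above, the sketch below ranks documents by the dot product between a query vector and each document vector (brute-force maximum inner product search). The vectors and the helper name `top_k_by_dot` are made up for this example; in practice the embeddings would come from a trained encoder.

```python
from typing import List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k_by_dot(
    query_vec: List[float], doc_vecs: List[List[float]], k: int = 1
) -> List[Tuple[int, float]]:
    """Return the k (doc_index, score) pairs with the highest dot product."""
    scored = [(i, dot(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical 2-dimensional embeddings, for illustration only.
doc_vecs = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
query_vec = [1.0, 0.0]
best = top_k_by_dot(query_vec, doc_vecs, k=2)
print(best)
```

Scanning every document like this is only feasible for small corpora; approximate nearest-neighbor indexes are the usual choice at scale.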
0. Preliminaries
- Fast search (Maximum Inner Product Search)
- Weaker accuracy
- Strong accuracy
- Very slow
Retriever -> Reranker -> Reader
1. Introduction
• Recently, information retrieval methods based on dense vector spaces have become popular to
address the limitations of sparse vectors.
• Dense retrieval methods require large amounts of training data to work well.
• Dense retrieval methods are extremely sensitive to domain shifts.
• Models trained on MS MARCO perform rather poorly on questions about COVID-19 scientific
literature.
• The models did not learn how to represent this topic well in a vector space.
• We present Generative Pseudo Labeling (GPL), an unsupervised domain adaptation method for dense
retrieval models.
2. Method
• For a given target corpus, we generate three queries for each passage using a T5 encoder-decoder
model.
• For each generated query, we use an existing retrieval system to retrieve 50 negative
passages.
• For each (query, positive, negative) tuple, we compute the margin score using a cross-encoder.
• Train the bi-encoder on these margin scores.
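The data-preparation steps above can be sketched as a small loop. This is only an outline under stated assumptions: `build_gpl_examples` is a hypothetical helper, and the query generator, negative miner, and cross-encoder are passed in as plain callables. The toy stand-ins below (a word-overlap scorer and a fixed query template) exist only so the sketch runs; they are not the models used in the paper.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str, str, float]  # (query, positive, negative, margin)


def build_gpl_examples(
    corpus: List[str],
    generate_queries: Callable[[str], List[str]],
    mine_negatives: Callable[[str], List[str]],
    cross_score: Callable[[str, str], float],
) -> List[Example]:
    """Pseudo-label (query, positive, negative) tuples with a score margin."""
    examples: List[Example] = []
    for passage in corpus:
        for query in generate_queries(passage):
            pos_score = cross_score(query, passage)
            for negative in mine_negatives(query):
                if negative == passage:
                    continue  # the source passage cannot be its own negative
                margin = pos_score - cross_score(query, negative)
                examples.append((query, passage, negative, margin))
    return examples


def toy_overlap(query: str, passage: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query words in the passage."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)


corpus = ["masks reduce virus spread", "bm25 ranks by term overlap"]
examples = build_gpl_examples(
    corpus,
    generate_queries=lambda p: ["what about " + p.split()[0]],  # toy generator
    mine_negatives=lambda q: corpus,  # toy miner: returns the whole corpus
    cross_score=toy_overlap,
)
for example in examples:
    print(example)
```

The resulting margins are the regression targets for the bi-encoder; the training step itself is not shown here.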
2. Method
MarginMSE Loss
• Multiple Negatives Ranking loss considers only the coarse relationship between queries and
passages, i.e., the matching passage is considered relevant while all other passages are
considered irrelevant.
• However, the query generator might generate queries that are not answerable by the passage.
Further, other passages might actually be relevant as well for a given query.
• MarginMSE loss uses a powerful cross-encoder to soft-label (query, passage) pairs. It then teaches
the dense retriever to mimic the score margin between the positive and negative query-passage
pairs.
In GPL,
- Bad query -> low positive score -> query and passage stay distant
- False negative -> high negative score -> query and passage stay similar
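To make the margin-matching idea concrete, here is a minimal pure-Python sketch of a MarginMSE-style loss: the mean squared difference between the student's margin and the teacher's (cross-encoder's) margin. The function name `margin_mse` and all score values are illustrative assumptions; real training would compute this on tensors inside a deep learning framework.

```python
from typing import Sequence


def margin_mse(
    student_pos: Sequence[float],
    student_neg: Sequence[float],
    teacher_pos: Sequence[float],
    teacher_neg: Sequence[float],
) -> float:
    """Mean squared error between student and teacher score margins."""
    n = len(student_pos)
    if n == 0 or not (n == len(student_neg) == len(teacher_pos) == len(teacher_neg)):
        raise ValueError("all score lists must be non-empty and of equal length")
    total = 0.0
    for sp, sn, tp, tn in zip(student_pos, student_neg, teacher_pos, teacher_neg):
        student_margin = sp - sn  # bi-encoder: score(q, pos) - score(q, neg)
        teacher_margin = tp - tn  # cross-encoder: same margin, used as soft label
        total += (student_margin - teacher_margin) ** 2
    return total / n


# Made-up scores for three triples: a clean pair, a bad generated query
# (low teacher positive score), and a false negative (high teacher negative score).
teacher_pos = [9.0, 0.5, 8.0]
teacher_neg = [-3.0, 0.0, 7.5]
student_pos = [6.0, 4.0, 5.0]
student_neg = [1.0, 1.0, 1.0]
print(margin_mse(student_pos, student_neg, teacher_pos, teacher_neg))
```

In the second and third triples the teacher margin is small, so the student is not pushed to separate the two passages strongly, which is the behavior summarized by the two "In GPL" cases.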
3. Experiments
Experimental Setup
• Query generator: docT5query
• Negative miners (retrievers): msmarco-distilbert-base-v3, msmarco-MiniLM-L-6-v3
• Retrieve 50 negatives with each retriever, then sample uniformly
• Cross-encoder: msmarco-MiniLM-L-6-v2
• Student: MS MARCO DistilBERT + mean pooling + dot product
• 140K training steps, batch size 32 (no need for a large batch size!)
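One possible reading of the negative-sampling step is sketched below: the candidates from all retrievers are pooled and one negative is drawn uniformly at random. Whether the candidates are pooled first or a retriever is chosen first is an assumption here, and `sample_negative` plus the passage IDs are hypothetical.

```python
import random
from typing import List, Sequence


def sample_negative(
    candidates_per_retriever: Sequence[List[str]],
    positive: str,
    rng: random.Random,
) -> str:
    """Draw one negative uniformly from the pooled retriever candidates."""
    pool: List[str] = []
    for candidates in candidates_per_retriever:
        # Never use the positive passage as a negative.
        pool.extend(p for p in candidates if p != positive)
    if not pool:
        raise ValueError("no negative candidates available")
    # A passage returned by several retrievers appears several times in the
    # pool and is therefore proportionally more likely to be drawn.
    return rng.choice(pool)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
retriever_a = ["p1", "p2", "p3"]  # hypothetical passage IDs
retriever_b = ["p3", "p4", "p1"]
negative = sample_negative([retriever_a, retriever_b], positive="p1", rng=rng)
print(negative)
```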
3. Experiments
Evaluation
• Six domain-specific text retrieval tasks from the BEIR benchmark
• Evaluation is done using nDCG@10
• Goal: rank more relevant documents higher!
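For reference, nDCG@10 can be computed as in the sketch below, assuming linear gain and a log2 rank discount. The function names and relevance labels are illustrative, and official evaluation tooling may differ in details such as tie handling.

```python
import math
from typing import List


def dcg_at_k(relevances: List[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k relevance grades."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(
    ranked_relevances: List[float], all_relevances: List[float], k: int = 10
) -> float:
    """DCG of the system ranking divided by the DCG of the ideal ranking.

    ranked_relevances: grades of the retrieved documents, in rank order.
    all_relevances: grades of every judged document for the query.
    """
    ideal = dcg_at_k(sorted(all_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal


judged = [1]                 # one relevant document exists for this query
good_ranking = [1, 0, 0, 0]  # relevant document retrieved at rank 1
poor_ranking = [0, 0, 0, 1]  # relevant document retrieved at rank 4
print(ndcg_at_k(good_ranking, judged))
print(ndcg_at_k(poor_ranking, judged))
```

The same relevant document earns less credit the lower it appears, which is the sense in which the metric rewards placing more relevant documents higher.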
Baselines
• Zero-Shot
• MS MARCO: DistilBERT dense retriever trained with MarginMSE
• BM25: lexical matching with Elasticsearch
• Pre-Training based Domain Adaptation
• SimCSE: encode the same sentence with different dropout masks + MNRL loss
• ICT: sample one sentence from a passage as the pseudo query
• TSDAE: denoising autoencoder
• Generation-based Domain Adaptation
• QGen: generated queries + Multiple Negatives Ranking loss
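Several baselines above train with Multiple Negatives Ranking (MNR) loss. A minimal sketch follows, assuming the usual in-batch formulation: a softmax cross-entropy over each query's scores against every passage in the batch, with the matching passage on the diagonal. The function name `mnr_loss` and the score matrices are made up, and any score scaling used by real implementations is omitted.

```python
import math
from typing import List


def mnr_loss(scores: List[List[float]]) -> float:
    """In-batch cross-entropy where scores[i][j] = score(query_i, passage_j).

    Passage i is treated as the only positive for query i; every other
    passage in the batch is treated as a negative.
    """
    total = 0.0
    for i, row in enumerate(scores):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(s - m) for s in row))
        total += log_z - row[i]  # equals -log softmax(row)[i]
    return total / len(scores)


uninformative = [[0.0, 0.0], [0.0, 0.0]]  # model cannot tell passages apart
well_separated = [[5.0, 0.0], [0.0, 5.0]]  # matching pairs score highest
print(mnr_loss(uninformative))
print(mnr_loss(well_separated))
```

Because every off-diagonal passage counts as fully irrelevant, this loss has no way to express a bad query or a false negative, which is the gap the cross-encoder margins in GPL are meant to fill.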
4. Results
5. Analysis
Influence of Training Steps
• GPL performance begins to saturate after around 100K steps.
• With TSDAE pre-training, performance improves consistently.
Influence of Corpus Size
• We find that with more than 10K passages, GPL can already outperform the zero-shot baseline.
5. Analysis
Robustness against Query Generation
• Generating 3 queries per passage appears to be optimal; generating more queries per passage
does not yield further improvements.
Sensitivity to Starting Checkpoints
• We also evaluate directly fine-tuning a DistilBERT model using QGen.
6. Conclusion
• In this work we propose GPL, a novel unsupervised domain adaptation method
for dense retrieval models.
• Pseudo-labeling overcomes two important shortcomings of previous methods.
• Not all generated queries are of high quality
• Training with mined hard negatives can be noisy
• We observe GPL performs well on all the datasets and significantly outperforms
other approaches.
• As a limitation, GPL requires a relatively complex training setup, and future work
can focus on simplifying this training pipeline.

Editor's Notes

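  1. The bi-encoder trained on MS MARCO performs worse than BM25. With a cross-encoder as well, the BM25 retriever is better than the MS MARCO retriever. Among the pre-training + domain adaptation methods, TSDAE is the best. Otherwise, GPL is the best. Training DistilBERT with TSDAE and then with GPL is even better. Re-ranking is better still.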