Dual Embedding Space Model (DESM)


A fundamental goal of search engines is to identify, given a query, documents that have relevant text. This is intrinsically difficult because the query and the document may use different vocabulary, or the document may contain query words without being relevant. We investigate neural word embeddings as a source of evidence in document ranking. We train a word2vec embedding model on a large unlabelled query corpus, but in contrast to how the model is commonly used, we retain both the input and the output projections, allowing us to leverage both the embedding spaces to derive richer distributional relationships. During ranking we map the query words into the input space and the document words into the output space, and compute a query-document relevance score by aggregating the cosine similarities across all the query-document word pairs. We postulate that the proposed Dual Embedding Space Model (DESM) captures evidence on whether a document is about a query term in addition to what is modelled by traditional term-frequency based approaches. Our experiments show that the DESM can re-rank top documents returned by a commercial Web search engine, like Bing, better than a term-matching based signal like TF-IDF. However, when ranking a larger set of candidate documents, we find the embeddings-based approach is prone to false positives, retrieving documents that are only loosely related to the query. We demonstrate that this problem can be solved effectively by ranking based on a linear mixture of the DESM and the word counting features.


Dual Embedding Space Model (DESM)
Bhaskar Mitra, Eric Nalisnick, Nick Craswell and Rich Caruana
https://arxiv.org/abs/1602.01137

How do you learn a neural embedding?
Set up a prediction task
Source Item → Target Item
(The bottleneck layers are crucial for generalization)
[Diagram labels: Source item (sparse), Source embedding (dense), Target item (sparse), Target embedding (dense), Distance Metric, The bottleneck]
Word2vec: Mikolov et al. (2013)
Word → Neighboring word
I/O: One-Hot
DSSM (Query-Document): Huang et al. (2013), Shen et al. (2014)
Query → Document
I/O: Bag-of-trigrams
DSSM (Session Pairs): Mitra (2015)
Query → Neighboring query in session
I/O: Bag-of-trigrams
DSSM (Language Model): Mitra and Craswell (2015)
Query prefix → query suffix
I/O: Bag-of-trigrams
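
To make the "Source Item → Target Item" idea concrete for the Word → Neighboring word case, here is a minimal sketch of how training pairs could be generated from a token sequence. The function name `neighbor_pairs`, the `window` parameter, and the example tokens are illustrative assumptions, not taken from the slides or from any word2vec implementation.

```python
from typing import Iterator, List, Tuple


def neighbor_pairs(tokens: List[str], window: int = 2) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs: each word with every word at most `window` positions away."""
    for i, source in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own neighbor
                yield source, tokens[j]


# Hypothetical three-word query, window of one word on each side.
pairs = list(neighbor_pairs(["seattle", "seahawks", "jerseys"], window=1))
for source, target in pairs:
    print(source, "->", target)
```

Swapping this pair generator for one that emits (query, clicked document) or (query prefix, query suffix) pairs changes the prediction task, and with it the notion of relatedness the embedding learns.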

Not all embeddings are created equal
The source-target training pairs strictly dictate what notion of
relatedness will be modelled in the embedding space
Is eminem more similar to rihanna or rap?
Is yale more similar to harvard or alumni?
Is seahawks more similar to broncos or seattle?
(Be careful of using pre-trained embeddings as inputs to a different model –
one-hot representations or learning an in situ embedding may be better!)

Word2vec
Learning word embeddings based
on word co-occurrence data.
Well-known for word analogy tasks,
[king] – [man] + [woman] ≈ [queen]
What if I told you that everyone
who uses Word2vec is throwing half
the model away?

Typical vs. Topical Relatedness
The IN-IN and the OUT-OUT similarities cluster words that occur in the same contexts
and are therefore of the same Type. The overall word2vec model is trained to predict
neighboring words. Therefore the IN-OUT similarity clusters words that commonly
co-occur under the same Topic.
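
The difference between the two kinds of similarity can be sketched with cosine similarity over two lookup tables. The 2-d vectors below are made up by hand purely to illustrate the effect; they are not real word2vec weights, and the `nearest` helper is a hypothetical name.

```python
import math
from typing import Dict, List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# ASSUMPTION: toy vectors chosen by hand, not learned from data.
IN: Dict[str, Vector] = {"yale": [1.0, 0.0], "harvard": [0.9, 0.1], "alumni": [0.0, 1.0]}
OUT: Dict[str, Vector] = {"yale": [0.1, 0.9], "harvard": [0.2, 0.8], "alumni": [0.95, 0.05]}


def nearest(word: str, query_space: Dict[str, Vector], candidate_space: Dict[str, Vector]) -> str:
    """Return the other word whose candidate-space vector is closest to `word`'s query-space vector."""
    q = query_space[word]
    others = [w for w in candidate_space if w != word]
    return max(others, key=lambda w: cosine(q, candidate_space[w]))


typical = nearest("yale", IN, IN)   # IN-IN: same-Type neighbor
topical = nearest("yale", IN, OUT)  # IN-OUT: same-Topic neighbor
print(typical, topical)
```

With these toy vectors the IN-IN lookup picks another university, while the IN-OUT lookup picks a word that tends to appear alongside one.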

Typical embeddings for Web search?
B. Mitra and N. Craswell. Query
auto-completion for rare prefixes.
In Proc. CIKM. ACM, 2015.

Which passage is about Albuquerque?
Traditionally in Search we look for evidence of
relevance of a document to a query in terms
of the number of matches of the query
terms in the document.
But there is useful signal in the non-matching
terms in the document about whether the
document is really about the query terms, or
simply mentions them.
A word co-occurrence model can be used to
check if the other words in the document
support the presence of the matching terms.
Passage about Albuquerque
Passage not about Albuquerque

Dual Embedding Space Model
• All-pairs comparison between query
and document terms
• Document embedding can be
pre-computed as the centroid of all the
unit vectors of the words in the
document
• DESM_IN-OUT uses IN-embeddings for
query words and OUT-embeddings
for document words
• DESM_IN-IN uses IN-embeddings for
document words as well
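
The scoring described in the bullets above can be sketched as follows: normalise each document word vector, average them into a centroid, then average the cosine similarity between each query word vector and that centroid. This is a minimal pure-Python sketch under stated assumptions: the function names are hypothetical, the toy vectors are invented, and silently skipping out-of-vocabulary terms is a choice made here, not something the slides specify.

```python
import math
from typing import Dict, List

Vector = List[float]


def unit(v: Vector) -> Vector:
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return v if norm == 0.0 else [x / norm for x in v]


def centroid_of_units(vectors: List[Vector]) -> Vector:
    """Centroid of the unit-normalised vectors; can be precomputed per document."""
    units = [unit(v) for v in vectors]
    return [sum(u[k] for u in units) / len(units) for k in range(len(units[0]))]


def desm_score(query_terms: List[str], doc_terms: List[str],
               query_space: Dict[str, Vector], doc_space: Dict[str, Vector]) -> float:
    """Mean cosine similarity between each query word vector and the document centroid."""
    # ASSUMPTION: terms missing from the vocabulary are skipped.
    q_vecs = [query_space[t] for t in query_terms if t in query_space]
    d_vecs = [doc_space[t] for t in doc_terms if t in doc_space]
    if not q_vecs or not d_vecs:
        return 0.0
    d_bar = unit(centroid_of_units(d_vecs))
    return sum(sum(a * b for a, b in zip(unit(q), d_bar)) for q in q_vecs) / len(q_vecs)


# ASSUMPTION: hand-made 2-d vectors, for illustration only.
IN = {"cambridge": [1.0, 0.0]}
OUT = {"university": [0.8, 0.6], "college": [0.6, 0.8],
       "giraffe": [-0.6, 0.8], "savanna": [-0.8, 0.6]}

score_about = desm_score(["cambridge"], ["university", "college"], IN, OUT)
score_off = desm_score(["cambridge"], ["giraffe", "savanna"], IN, OUT)
print(score_about > score_off)
```

Passing the IN table for both `query_space` and `doc_space` gives the IN-IN variant; passing IN for the query and OUT for the document gives the IN-OUT variant.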

Because Cambridge is not an African mammal
Query: cambridge
[Figure: three example passages, marked DESM ✔ / BM25 ✔, DESM ✘ / BM25 ✔, and DESM ✔ / BM25 ✘]

Telescoping Evaluation
As a weak ranking feature, DESM_IN-OUT performs better than BM25,
LSA and DESM_IN-IN models on a UHRS (Overall) set and a click-based
test set.

Full retrieval evaluation
The DESM models only a specific aspect of document relevance. In the presence
of many random documents (distractors), it is susceptible to spurious false
positives and needs to be combined with lexical ranking features such as BM25.


9. IN-OUT vs. IN-IN


13. DESM vs. BM25

14. Making different mistakes

15. Questions?
