Natural Language Inference
Viral Gupta
Agenda
• Describe the problem (terminology and dataset)
• Describe a framework for the solution
• Various models
• Results
• Demo
• TensorFlow
Problem Description
• NLI is an important problem in NLP: summarization, information retrieval, and question answering all rely on high-quality NLI.
• Problem set-up:
• The SNLI dataset is a collection of about half a million natural language inference (NLI) problems.
• Given sentence A (the premise) and sentence B (the hypothesis), classify the relationship between them.
• The relationship is one of entailment, contradiction, or neutral.
Problem Description
• Training dataset: 550,000 sentence pairs
• Test set: 10,000 sentence pairs
• Dev set: 10,000 sentence pairs
• Vocabulary: 33,000 words
• Original paper from EMNLP 2015 (http://stanford.edu/~angeli/papers/2015-emnlp-snli.pdf)
Few Examples
SNLI Dataset
Few More Examples

• Premise: a statue at a museum that no one seems to be looking at.
• Hypothesis: there is a statue that not many people seem to be interested in.
• Label: entailment
• Premise: a couple walk hand in hand down a street.
• Hypothesis: the couple is married.
• Label: neutral
Model Architecture
[Architecture diagram: GloVe / CBOW / skip-gram word embeddings (300d) feed a premise sentence model and a hypothesis sentence model (bag of words, LSTM/GRU, or CNN), each producing a 100d sentence vector; the concatenated 200d vector passes through three 200d tanh layers (W1, W2, W3), regularized with dropout and L2, into a softmax trained with cross-entropy loss.]
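As a rough sketch of this forward pass (plain NumPy; the params dictionary and its key names are hypothetical):

import numpy as np

def classify(premise_vec, hypothesis_vec, params):
    # premise_vec, hypothesis_vec: 100d sentence representations
    h = np.concatenate([premise_vec, hypothesis_vec])       # 200d input
    for W, b in params["tanh_layers"]:                      # three 200d tanh layers (W1, W2, W3)
        h = np.tanh(np.dot(W, h) + b)
    logits = np.dot(params["W_out"], h) + params["b_out"]   # 3 classes
    exp = np.exp(logits - np.max(logits))
    return exp / np.sum(exp)  # softmax over {entailment, contradiction, neutral}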
Distributed Word Representation
Google Images
Distributed Word Representation
• Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) where words or phrases from the vocabulary are mapped to vectors of real numbers in a space that is low-dimensional relative to the vocabulary size ("continuous space").
• Different algorithms, similar ideas:
• CBOW
• Skip-gram
• GloVe
Distributed Word Representation
Google Images
Skip-Gram
Google Images
Skip-gram Equations
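The equations on this slide are not preserved in the transcript; for reference, the standard skip-gram negative-sampling objective for a center vector v_c, an observed context vector u_o, and K negative samples u_k, which the pseudocode below differentiates, is:

J(v_c) = -\log \sigma(u_o^\top v_c) - \sum_{k=1}^{K} \log \sigma(-u_k^\top v_c)

\frac{\partial J}{\partial v_c} = -\bigl(1 - \sigma(u_o^\top v_c)\bigr)\, u_o + \sum_{k=1}^{K} \bigl(1 - \sigma(-u_k^\top v_c)\bigr)\, u_k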
Skip-gram Pseudocode
Distributed word representations are good.
Center word → target word
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negSamplingGradient(center, neighbour, outputVectors, dataset, K=10):
    # Gradient of the negative-sampling loss with respect to the center vector v_c.
    word_dim = outputVectors.shape[1]
    word_samples = np.zeros((K, word_dim))
    for i in range(K):
        # draw a random negative sample (assumes dataset exposes a sampler)
        sample = dataset.sampleTokenIdx()
        word_samples[i, :] = outputVectors[sample, :]
    u_o = outputVectors[neighbour, :]
    ## compute derivative with respect to the center word v_c
    ## first part: -(1 - sigmoid(u_o . v_c)) * u_o
    cost_tmp = sigmoid(np.dot(u_o, center))
    grad_part_1 = -(1 - cost_tmp) * u_o
    ## second part: with respect to each negative sample the
    ## gradient has the form (1 - sigmoid(-u_k . v_c)) * u_k
    cost_tmp = sigmoid(np.dot(-word_samples, center))
    tmp_2 = (1 - cost_tmp)[:, np.newaxis] * word_samples
    grad_part_2 = np.sum(tmp_2, axis=0)
    grad_vc = grad_part_1 + grad_part_2
    return grad_vc
Bag of Words Approach
• Premise sentence: simply add the word vectors for all the words in the sentence, as sketched below.
• Do the same for the hypothesis sentence.
• Now we have a feature vector for each sentence.
• Train the neural network. It is not a really deep network.
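A minimal sketch of the bag-of-words feature extraction (the embeddings matrix and vocab dict are hypothetical names):

import numpy as np

def sentence_vector(sentence, embeddings, vocab):
    # Sum the embedding of every in-vocabulary word in the sentence.
    vec = np.zeros(embeddings.shape[1])
    for word in sentence.lower().split():
        if word in vocab:
            vec += embeddings[vocab[word]]
    return vec

# premise_vec = sentence_vector("a couple walk hand in hand down a street .", embeddings, vocab)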
Using LSTM to Extract a Feature Vector
Google Images
Vanilla RNN
CS231n Slides
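The vanilla RNN recurrence from the CS231n slides, as a one-step sketch; running it over a sentence and taking the final hidden state gives the fixed-size feature vector:

import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    # h_t = tanh(W_xh x_t + W_hh h_{t-1} + b)
    return np.tanh(np.dot(W_xh, x_t) + np.dot(W_hh, h_prev) + b_h)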
LSTM
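For comparison, a one-step LSTM sketch (the four gate weight matrices stacked into a single W; names hypothetical):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    # z = [x_t; h_prev]; W has shape (4*hidden_dim, input_dim + hidden_dim)
    z = np.concatenate([x_t, h_prev])
    i, f, o, g = np.split(np.dot(W, z) + b, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # candidate cell update
    c_t = f * c_prev + i * g                      # cell state
    h_t = o * np.tanh(c_t)                        # hidden state / feature vector
    return h_t, c_t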
CNN
• Go over Convolution operation
• Go over Max-pooling operation
• How to apply these to NLP
Convolution
Max-Pooling
Applying CNN to Sentences
Filters → Activation Map → Feature Set
WildML Blog
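A minimal sketch of one filter applied to a sentence matrix, in the WildML-style setup (names hypothetical). Repeating this for many filters of several window sizes and concatenating the pooled values gives the sentence's feature set:

import numpy as np

def conv_max_pool(sentence_matrix, filt):
    # sentence_matrix: (num_words, embed_dim); filt: (window, embed_dim)
    window = filt.shape[0]
    n_windows = sentence_matrix.shape[0] - window + 1
    activations = np.empty(n_windows)
    for i in range(n_windows):
        # convolution: elementwise product of the filter with a window of words
        activations[i] = np.sum(sentence_matrix[i:i + window] * filt)
    # ReLU, then max-pooling over time: one feature per filter
    return np.max(np.maximum(activations, 0.0))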
Results

• Since the weight matrices are randomly initialized, the starting accuracy of almost all models is about 33% (3 classes).
• With a high learning rate and a high dropout keep-probability, very large gradients flow through the network, so the loss function keeps swinging.
• Bag of words without word-vector updates gives about 62-64% accuracy.
• Bag of words with word-vector updates gives about 70% accuracy.
• CNN accuracy is about 75%.
• LSTM accuracy is 78-79%. The state of the art is 86% (neural attention models).
Hyper-parameters
• Network dimensionality
• Optimization routine (Adam, stochastic gradient descent)
• Learning rate, regularization parameters (L2, dropout)
• Learning rate: determines how much the parameters change on each update. Larger values cause the parameters to swing, leading to sub-optimal results.
• Dropout: the probability that an individual neuron fires or shuts off. It prevents over-fitting by randomly withholding information, which lets the network learn the data patterns without relying too heavily on any single neuron. A small sketch follows below.
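As an illustration, inverted dropout in plain NumPy; keep_prob corresponds to the dropout keep-probability mentioned in the results:

import numpy as np

def dropout(h, keep_prob, train=True):
    if not train:
        return h                      # no dropout at test time
    mask = (np.random.rand(*h.shape) < keep_prob) / keep_prob  # inverted dropout
    return h * mask                   # randomly shut off neurons, rescale the rest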
Demo
TensorFlow

• TensorFlow is a deep learning library recently open-sourced by Google.
• TensorFlow provides primitives for defining functions on tensors and for automatically computing their derivatives.
• It has various utilities for solving stochastic minimization problems.
• Similar to NumPy; in fact, NumPy arrays can be converted into tensors (though this isn't the most efficient approach). A minimal sketch follows below.
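A minimal sketch of defining and minimizing a loss, using the TensorFlow 1.x API that was current at the time (shapes and names are hypothetical):

import tensorflow as tf  # 1.x-era API

x = tf.placeholder(tf.float32, [None, 200])  # concatenated sentence features
y = tf.placeholder(tf.int64, [None])         # labels: entailment / contradiction / neutral

W = tf.Variable(tf.random_normal([200, 3], stddev=0.1))
b = tf.Variable(tf.zeros([3]))
logits = tf.matmul(x, W) + b

loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=y))
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # sess.run(train_op, feed_dict={x: batch_x, y: batch_y})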
TensorFlow
CS224d Slides
Questions
Thanks!
