Engineering
Intelligent NLP
Applications Using
Deep Learning –
Part 2
Saurabh Kaushik
• Part 1:
• Why NLP?
• What is NLP?
• What is the Word & Sentence
Modelling in NLP?
• What is Word Representation in
NLP?
• What is Language Processing in
NLP?
Agenda
• PART 2 :
• WHY DL FOR NLP?
• WHAT IS DL?
• WHAT IS DL FOR NLP?
• HOW RNN WORKS FOR NLP?
• HOW CNN WORKS FOR NLP?
WHY DL FOR NLP?
Why DL for NLP?
• The majority of traditional, rule-based natural language processing procedures represent words as "One-Hot" encoded vectors.
Words as "One-Hot" Vectors
• A lot of value in NLP comes from understanding a word in relation to its neighbors and their syntactic relationships.
Lack of Lexical Semantics
• Bag of Words models, including TF-IDF models, cannot distinguish certain contexts.
Problems with Bag of Words
• Two different words will have no interaction between them.
• "One-Hot" encoding produces enormously long vectors for a large corpus.
• Traditional models largely focus on syntactic-level representations instead of semantic-level representations.
• Sentiment analysis can be easy for longer documents.
• However, on the dataset of single-sentence movie reviews (Pang and Lee, 2005), accuracy did not exceed 80% for more than 7 years.
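To make the first two bullets concrete, here is a minimal sketch (assuming Python with numpy; the toy vocabulary is made up for illustration) of why one-hot vectors carry no similarity information:

```python
import numpy as np

# Toy vocabulary; the index positions are arbitrary (illustration only).
vocab = {"hotel": 0, "motel": 1, "cat": 2}

def one_hot(word, vocab):
    """Return a |V|-dimensional vector with a single 1 at the word's index."""
    v = np.zeros(len(vocab))
    v[vocab[word]] = 1.0
    return v

hotel, motel = one_hot("hotel", vocab), one_hot("motel", vocab)
# Dot product is 0: "hotel" and "motel" look totally unrelated,
# even though they are near-synonyms. The vector length also grows with |V|.
print(hotel @ motel)  # 0.0
```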
if wi.form == 'John':
    wi.pos = 'noun'
if wi.form == 'majors':
    wi.pos = 'noun'
if wi.form == 'majors' and wi-1.form == 'two':
    wi.pos = 'noun'
if wi.form == 'studies' and wi-1.pos == 'num':
    wi.pos = 'noun'
What is Rule Based Approach?
Find the part-of-speech tag of each word.
(Slide annotations on the rules above: Good, Really, Too Specific, Keep Doing this)
Img: Jinho D. Choi – Machine Learning in NLP PPT
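The rules above are pseudocode; here is a minimal runnable sketch (assuming Python, with a hypothetical Token class) that makes the same point: each rule only covers the exact case it was written for.

```python
# A runnable sketch of the hand-written POS rules above. The Token class
# and the rules themselves are illustrative, not a real tagger.
class Token:
    def __init__(self, form, pos=None):
        self.form, self.pos = form, pos

def tag(tokens):
    for i, w in enumerate(tokens):
        prev = tokens[i - 1] if i > 0 else Token("")
        if w.form == "John":
            w.pos = "noun"
        if w.form == "majors":
            w.pos = "noun"
        if w.form == "studies" and prev.pos == "num":
            w.pos = "noun"
    return tokens

# Covers the sentences the rules were written for, but any new word or
# word order falls through untagged: "too specific" by construction.
for t in tag([Token("John"), Token("studies")]):
    print(t.form, t.pos)  # John noun / studies None (prev.pos != 'num')
```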
• Algorithm:
1. Gather as much LABELED data as you
can get
2. Throw some algorithms at it (mainly put in
an SVM and keep it at that)
3. If you actually have tried more algos: Pick
the best
4. Spend hours hand engineering some
features / feature selection / dimensionality
reduction (PCA, SVD, etc)
5. Repeat…
What is Machine Learning Approach?
Machine Learning – Algo & Arch
Feature Engineering:
• Hand-crafting features for the given text, a process called Feature Engineering.
• Feature engineering: functions which transform input (raw) data into a feature space.
• Discriminative – for the decision boundary.
• NLP tasks often deal with 1 ~ 10
million features.
• These feature vectors are very
sparse.
• The values in these vectors are often
binary.
• Many features are redundant in some
way.
• Feature selection takes a long time.
• Is machine learning easier or harder
for NLP?
What is Machine Learning Approach?
Extract features for each word.
Convert string features into vector.
Img: Jinho D. Choi – Machine Learning in NLP PPT
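As a concrete instance of this recipe, here is a minimal sketch, assuming Python with scikit-learn available; the texts, labels, and the TF-IDF + linear SVM choice are illustrative placeholders for steps 1–4 above:

```python
# Classical ML pipeline sketch: hand-engineered sparse features + an SVM.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts = ["good movie", "horrible movie", "great plot", "bad acting"]
labels = [1, 0, 1, 0]  # step 1: gather labeled data (toy placeholders)

# Steps 2-4: TF-IDF gives very sparse, high-dimensional feature vectors,
# exactly the regime described above; LinearSVC is the "throw an SVM at it".
model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(texts, labels)
print(model.predict(["good acting"]))  # predicted label for unseen text
```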
• Machine Learning is about:
• Features: In ML, feature engineering is an explicit and mostly manual (programmatic) process. It is painful, over-specified, often incomplete, and takes a long time to design and validate.
• Representation: ML has a specific framework for word representation based on the problem set and its algorithm.
• Learning: It is mostly supervised learning.
Why DL vs ML?
• Deep Learning is about:
• Features: Identify and learn features automatically. Learned features are easy to adapt and fast to learn.
• Representation: DL provides a very flexible, universal, (almost) learnable framework for representing words, visual, and linguistic information.
• Learning: DL can learn from both unsupervised data (raw text, image, audio content) and supervised data (sentiment-labeled, POS-tagged).
How is Classical ML different from Deep Learning for NLP?
• Learning Representation
• Handcrafted features are time consuming, incomplete, and over-specified.
• They need to be redone for each specific domain's data.
• Need for Distributional Similarity & Distributed Representation
• Current NLP systems are incredibly fragile because of their atomic symbol representations.
• Unsupervised Features and Weight Training
• Most NLP & ML techniques require labelled data (supervised learning).
• Learning Multiple Levels of Representation
• Successive model layers learn deeper intermediate representations.
• Language is composed of words and phrases; ML models need compositionality.
• Recursion: the same operator (word feature) is applied repeatedly on different components (words in sentences).
• Why Now?
• New methods of unsupervised pre-training
• More efficient parameter estimation
• Better understanding of parameter regularization
What are the other major reasons for exploring DL for NLP?
Where can DL be applied for NLP tasks?
DL Algorithms and their NLP Usage:
Neural Network (NN) - Feed forward • POS, NER, Chunking
• Entity and Intent Extraction
Recurrent Neural Networks (RNN) • Language Modeling and Generating Text
• Machine Translation
• Question Answering System
• Image Captioning - Generating Image Descriptions
Recursive Neural Networks • Parsing Sentences
• Sentiment Analysis
• Paraphrase Detection
• Relation Classification
• Object Detection
Convolutional Neural Network (CNN) • Sentence / Text Classification
• Relation Extraction and Classification
• Sentiment classification; Spam Detection or Topic
Categorization.
• Classification of Search Queries
• Semantic relation extraction
WHAT IS DL?
• In Human Neuron:
• A neuron: many-inputs / one-output unit
• Output can be excited or not excited
• Incoming signals from other neurons determine if the
neuron shall excite ("fire")
• Output subject to attenuation in the synapses, which are
junction parts of the neuron
What is Neural Network?
• In a computer neuron:
1. Take the inputs.
2. Calculate the summation of the inputs.
3. Compare it with the threshold set during the learning stage.
• Artificial Neural Networks are designed to solve problems by trying to mimic the structure and function of our nervous system.
• Neural Networks are based on simulated neurons, which are joined together in a variety of ways to form a network.
• A Neural Network resembles the human brain in the following two ways:
• A NN acquires knowledge through learning.
• This knowledge is stored in the interconnection strengths, called synaptic weights.
• In Logistic Regression based NN,
• X : Input parameter at each node
• B : Bias parameter at each node
• W : Weight at each node
• H(x) : Output function at each node
• A : Activation Function at each node
What is Neural Network?
• Neuron – logistic regression or similar function
• Bias unit – intercept term / always-on feature
• Activation function – logistic response (sigmoid for non-linearity)
• Feed forward – computing activations forward, layer by layer
• Backpropagation – running stochastic gradient descent backward, layer by layer
• Weight decay – regularization / Bayesian prior
(Figures: single-layer and multi-layer neural networks; neuron node compute function; neural node components)
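A minimal sketch of a single neuron node under the notation above (x = inputs, w = weights, b = bias, a = activation, h(x) = output), assuming Python with numpy; all values are toy placeholders:

```python
import numpy as np

def sigmoid(z):
    """Logistic activation function a(z)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    """Single logistic-regression neuron: h(x) = a(w.x + b)."""
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # toy inputs (illustration only)
w = np.array([0.8, 0.2, -0.5])   # toy weights
b = 0.1                          # bias / intercept term
print(neuron(x, w, b))           # output in (0, 1)
```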
• The first model developed using a NN was meant to show the advantage of using distributed representations to beat state-of-the-art statistical language models (smoothed n-grams).
• Proposed in 2003, this NN consists of a one-hidden-layer feed-forward neural network that predicts the next word in a sequence. It is called the Neural Probabilistic Language Model.
• Output of the model: f(w_t, w_{t−1}, …, w_{t−n+1})
• Probability: p(w_t | w_{t−1}, …, w_{t−n+1})
• The general building blocks of their model, however,
are still found in all current neural language and word
embedding models. These are:
• Embedding Layer: a layer that generates word
embeddings by multiplying an index vector with a
word embedding matrix;
• Intermediate Layer(s): one or more layers that
produce an intermediate representation of the input,
e.g. a fully-connected layer that applies a non-
linearity to the concatenation of word embeddings of
n previous words;
• Softmax Layer: the final layer that produces a
probability distribution over words in V.
How NN can be used in NLP?
Ref: https://www.iro.umontreal.ca/~bengioy/yoshua_en/research.html
Classic neural language model (Bengio et al., 2003)
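A toy forward pass in the spirit of this model, assuming Python with numpy; all layer sizes and weights are random placeholders, not Bengio et al.'s actual configuration:

```python
import numpy as np

# Embedding lookup -> concatenate n-1 previous words -> tanh hidden layer
# -> softmax over the vocabulary V, i.e. p(w_t | w_{t-1}, ..., w_{t-n+1}).
rng = np.random.default_rng(0)
V, d, n_prev, h = 10, 4, 3, 8            # vocab size, emb dim, context, hidden

C = rng.normal(size=(V, d))              # embedding matrix (embedding layer)
H = rng.normal(size=(h, n_prev * d))     # intermediate (hidden) layer weights
U = rng.normal(size=(V, h))              # output weights feeding the softmax

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def next_word_probs(context_ids):
    x = C[context_ids].reshape(-1)       # lookup + concatenation
    a = np.tanh(H @ x)                   # non-linearity over the concatenation
    return softmax(U @ a)                # probability distribution over V

p = next_word_probs([1, 5, 2])           # three previous word indices
print(p.sum())                           # ~1.0: a distribution over V words
```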
• CBOW (Continuous Bag of Words):
• The input to the model could be w_{i−2}, w_{i−1}, w_{i+1}, w_{i+2}, the preceding and following words of the current word. The output of the neural network will be w_i. Hence you can think of the task as "predicting the word given its context".
• Note that the number of context words used depends on your setting for the window size.
How to get Syntactic and Semantic Relationships using DL?
• Skip-gram:
• The input to the model is w_i, and the output could be w_{i−1}, w_{i−2}, w_{i+1}, w_{i+2}. So the task here is "predicting the context given a word". Also, the context is not limited to the immediate context; training instances can be created by skipping a constant number of words in the context, for example w_{i−3}, w_{i−4}, w_{i+3}, w_{i+4}, hence the name skip-gram.
• Note that the window size determines how far forward and backward to look for context words to predict.
• Examples :
• From Jono's example, the sentence "Hi fred how
was the pizza?" becomes:
• Continuous bag of words: 3-grams {"Hi fred
how", "fred how was", "how was the", ...}
• Skip-gram 1-skip 3-grams: {"Hi fred how", "Hi
fred was", "fred how was", "fred how the", ...}
• Notice "Hi fred was" skips over "how". That is the general idea of CBOW and skip-gram. In this case, skip-gram uses 1-skip n-grams.
(Figures: syntactic relations and semantic relations captured by word vectors)
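A minimal sketch, in plain Python, of how CBOW and skip-gram training instances can be generated from Jono's sentence; the window sizes and pairing scheme are simplifications of what word2vec actually samples:

```python
# Build toy training instances for CBOW (context -> word) and
# skip-gram (word -> context) from a tokenized sentence.
tokens = "Hi fred how was the pizza".split()

def cbow_pairs(tokens, window=1):
    """(context words) -> center word."""
    pairs = []
    for i, w in enumerate(tokens):
        ctx = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((ctx, w))
    return pairs

def skipgram_pairs(tokens, window=2):
    """center word -> each context word, possibly skipping over words."""
    pairs = []
    for i, w in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((w, tokens[j]))
    return pairs

print(cbow_pairs(tokens)[2])       # (['fred', 'was'], 'how')
print(skipgram_pairs(tokens)[:3])  # pairs like ('Hi', 'fred'), ('Hi', 'how')
```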
HOW IS RNN USED FOR NLP?
• A recurrent neural network (RNN) is a neural network model proposed in the 80's for modelling time series.
• The structure of the network is similar to a feedforward neural network, with the distinction that it allows a recurrent hidden state whose activation at each time step depends on that of the previous time step (a cycle).
What is Recurrent Neural Network (RNN)?
• The time recurrence is introduced by relating the hidden layer activity h_t to its past hidden layer activity h_{t−1}, e.g. h_t = σ(W_xh x_t + W_hh h_{t−1}).
• This dependence is nonlinear because of the use of a logistic function σ.
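A minimal sketch of this recurrence, assuming Python with numpy; the sizes and random weights are placeholders:

```python
import numpy as np

# Elman-style RNN step: h_t depends nonlinearly (logistic function, as on
# the slide) on the current input x_t and the previous hidden state h_{t-1}.
rng = np.random.default_rng(0)
d_in, d_h = 5, 8
W_xh = rng.normal(size=(d_h, d_in))   # input-to-hidden weights
W_hh = rng.normal(size=(d_h, d_h))    # hidden-to-hidden (recurrent) weights
b = np.zeros(d_h)

def step(x_t, h_prev):
    return 1.0 / (1.0 + np.exp(-(W_xh @ x_t + W_hh @ h_prev + b)))

h = np.zeros(d_h)
for x_t in rng.normal(size=(4, d_in)):  # a length-4 toy input sequence
    h = step(x_t, h)                    # state carries context forward in time
print(h)
```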
• A recursive neural network is a recurrent neural network
where the unfolded network given some finite input is
expressed as a (usually: binary) tree, instead of a "flat"
chain (as in the recurrent network).
• Recursive Neural Networks are exceptionally useful for
learning structured information
• Recursive Neural Networks are both:
• Architecturally Complex
• Computationally Expensive
What is Recursive Neural Network?
• Recursive Neural Network:
• A recursive neural network is more like a hierarchical network where there is really no time aspect to the input sequence; instead, the input has to be processed hierarchically, in a tree fashion. Here is an example of how a recursive neural network looks. It shows how to learn a parse tree of a sentence by recursively taking the output of the operation performed on smaller chunks of the text.
What is the Difference Between Recurrent and Recursive NNs?
• Recurrent Neural Network:
• A recurrent neural network basically unfolds over time. It is used for sequential inputs where the time factor is the main differentiating factor between the elements of the sequence. For example, here is a recurrent neural network used for language modelling that has been unfolded over time. At each time step, in addition to the user input at that time step, it also accepts the output of the hidden layer that was computed at the previous time step.
How Does Recurrent NN Work?
(Figures: NN formulation; NN parsing tree; NN neuron processing; sentence parsing)
• Character-Level Modeling Through RNN:
• Objective: to train an RNN to predict the next correct character of a given word.
1. Word to be predicted – "Hello"
2. Character-level vocabulary = [h, e, l, o]
• Training Model:
1. The probability of "e" should be high given the context of "h",
2. "l" should be likely in the context of "he",
3. "l" should also be likely given the context of "hel", and finally
4. "o" should be likely given the context of "hell".
Word-Level Modeling Through RNN: (figure)
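A minimal sketch of the character-level setup above, assuming Python with numpy; it only builds the one-hot inputs and (current, next) character pairs, not the RNN training loop:

```python
import numpy as np

# Character vocabulary [h, e, l, o] and training pairs for "hello".
vocab = ["h", "e", "l", "o"]
idx = {c: i for i, c in enumerate(vocab)}

def one_hot(c):
    """One-hot encode a character over the 4-character vocabulary."""
    v = np.zeros(len(vocab))
    v[idx[c]] = 1.0
    return v

word = "hello"
pairs = [(word[i], word[i + 1]) for i in range(len(word) - 1)]
print(pairs)  # [('h','e'), ('e','l'), ('l','l'), ('l','o')]

# An RNN would be trained on these so that, e.g., p('e' | "h") and
# p('o' | "hell") come out high; X is fed one step at a time.
X = [one_hot(c) for c, _ in pairs]
```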
What are the Different Topologies for Recurrent NNs?
(Figure: recurrent NN topologies) Common neural network (e.g. feed-forward network); prediction of future states based on a single observation; sentiment classification; machine translation; simultaneous interpretation.
HOW IS CNN USED FOR NLP?
• Mimics the neural processing of the biological brain in order to analyze given data.
• Essentially a neural network that uses convolution in place of general matrix multiplication in at least one of its layers.
• Major Features of CNN
• Locally Receptive Fields
• Shared Weights
• Spatial or Temporal Sub-sampling
• Consists of Three Major Parts
• Convolution
• Pooling
• Fully Connected NN
What is CNN?
Biologically Inspired
• Convolution Layer: Its purpose is to provide a representation of the data from different views. For this, it applies a kernel/filter to the input layer. It has hyperparameters like:
• Stride size, which decides how the filter moves over the input layer.
• Convolution with zero padding is called wide convolution; without it, narrow convolution.
• Pooling Layer: Its main purpose is to provide a fixed-dimension output matrix from the convolution layer for the next layer's classification task. Pooling layers subsample their input by non-linear down-sampling to simplify the information output from the convolutional layer (max pooling or average pooling).
• Fully Connected Layer: Its main purpose is to provide a classification layer using fully connected neural networks.
How CNN Works?
(Figure: Input Layer → Convolution Layer → Pooling Layer → Fully Connected Layer)
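A minimal sketch of 1-D convolution with the hyperparameters named above (stride, and wide vs. narrow output via zero padding), assuming Python with numpy; the input sequence and filter are toy values:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # toy input sequence
k = np.array([1.0, 0.0, -1.0])            # a single filter/kernel of size 3

def conv1d(x, k, stride=1, pad=0):
    """Slide kernel k over x; pad > 0 gives a 'wide' convolution."""
    x = np.pad(x, pad)
    out_len = (len(x) - len(k)) // stride + 1
    return np.array([x[i*stride : i*stride + len(k)] @ k
                     for i in range(out_len)])

print(conv1d(x, k))                 # narrow (no padding): length 3
print(conv1d(x, k, pad=len(k) - 1)) # wide (zero padding): length 7
print(conv1d(x, k, stride=2))       # stride 2: the filter jumps 2 positions
```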
• CNNs are also efficient in terms of representation. This is clearly good for images, but how good is it for NLP?
• CNNs help in context analysis by using a windows approach to scan sentences in n-gram fashion.
• Parallel to images, NLP has word embedding representations, which let us analyze the context of a word (the words before and after it).
• N-gram analysis for a large vocabulary can quickly become compute intensive.
• This is where a higher-abstraction mechanism for representing word embeddings helps.
• This is where CNN plays best.
How can CNN be used for NLP?
How can CNN be used for a Sentiment Analysis Task?
(Figure: a CNN classifying the padded sentence "PAD The movie was horrible PAD" as Negative / Positive)
• Input Layer: word vectors for each token.
• Convolution Layer: a single filter with window size n = 3 (word vectors, weight matrix, bias).
• Pooling Layer: pooling captures the most important activation.
• Classification / Output Layer: a softmax classifier outputs Negative / Positive.
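A toy forward pass matching this figure, assuming Python with numpy; the embeddings and weights are random placeholders, while the single size-3 filter, max pooling, and 2-way softmax follow the slide:

```python
import numpy as np

rng = np.random.default_rng(0)
sent = ["PAD", "The", "movie", "was", "horrible", "PAD"]
d, n = 4, 3                                   # embedding dim, window size
E = {w: rng.normal(size=d) for w in sent}     # toy word vectors (input layer)
W = rng.normal(size=(n * d,))                 # single filter weights
b = 0.0                                       # filter bias
U = rng.normal(size=(2, 1))                   # softmax weights (2 classes)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Convolution layer: slide the size-3 window over the padded sentence.
windows = [sent[i:i + n] for i in range(len(sent) - n + 1)]
acts = [np.tanh(W @ np.concatenate([E[w] for w in win]) + b)
        for win in windows]

pooled = np.array([max(acts)])                # max pooling: strongest activation
print(softmax(U @ pooled))                    # p(Negative), p(Positive)
```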
• SEQ2SEQ NN for NLP
• Encoder & Decoder
• Memory
• LSTM & RNN for NLP
• Attention for NLP
What are the further, deeper aspects of DL for NLP?
Thank You
Saurabh Kaushik