Parsing Natural Scenes and Natural Language with Recursive Neural Networks
1. Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Richard Socher, Cliff Chiung-Yu Lin, Andrew Y. Ng, Christopher D. Manning
ICML 2011
Jie Cao
2. Outline
• Context
• Recursive Neural Network Definition
• Input Representation
• Output
• Greedy Structure-Predicting RNNs
• Loss Function
• Max-Margin Framework
• Backpropagation Through Structure
• L-BFGS
• Experiments and Improved RNN
7. Word Embedding Matrix
• Each word is represented as a dense vector: a column of a word embedding matrix whose vectors capture corpus co-occurrence statistics (Collobert & Weston, 2008).
Collobert, R. and Weston, J. A unified architecture for natural language processing: deep neural networks with multitask learning. In ICML, 2008
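A minimal NumPy sketch of the lookup (sizes and names such as L and word_vector are assumptions for illustration, not from the slides): a word vector is just a column of the embedding matrix.

    import numpy as np

    n, vocab_size = 100, 50000                     # assumed dimensions
    rng = np.random.default_rng(0)
    L = rng.normal(0.0, 0.01, (n, vocab_size))     # word embedding matrix

    def word_vector(i):
        # x_i = L @ e_i: multiplying by the one-hot vector e_i
        # reduces to selecting column i of L
        return L[:, i]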
8. Input Representation for Scene Image
• Compute visual features F_i for each segment i = 1, ..., N_segs in an image: 78 segments per image, with 119 features for every segment (Gould et al., 2009).
• Map the features into the "semantic" n-dimensional space:
a_i = f(W_sem F_i + b_sem)
• W_sem is the matrix of parameters we want to learn; b_sem is a bias.
• f is applied element-wise and can be any sigmoid-like function; the paper uses the original sigmoid f(x) = 1/(1 + e^(-x)).
Gould, S., Fulton, R., and Koller, D. Decomposing a Scene into Geometric and Semantically Consistent Regions. In ICCV, 2009
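A sketch of this mapping with the shapes the slide gives (119 raw features in, n semantic dimensions out); variable names and n = 100 are assumptions:

    import numpy as np

    n = 100                                        # assumed semantic dimension
    rng = np.random.default_rng(0)
    W_sem = rng.normal(0.0, 0.01, (n, 119))        # parameters to learn
    b_sem = np.zeros(n)                            # bias

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))            # the "original" sigmoid

    def segment_activation(F_i):
        # map one segment's 119 raw features into the semantic space
        return sigmoid(W_sem @ F_i + b_sem)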
9. f: X → Y (the output Y)
• For the visual parser:
• A visual tree is correct if all adjacent segments that belong to the same class (all segments are labeled) are merged into one super segment before any merges occur with super segments of different classes.
• The definition leaves open how object parts are internally merged, and how complete, neighboring objects are merged into the full scene image.
• Hence there is a set of correct trees.
• For the language parser:
• Y(x) has only one element, the annotated ground-truth tree: Y(x) = {y}.
How do we evaluate the error between the true trees Y and a predicted tree ŷ? (Loss function)
10. Recursive NN Definition
• For a potential adjacent pair of children (c_i, c_j), compute the new representation of the parent p(i,j):
p = f(W [c_i; c_j] + b)
• and the new score of that parent:
s = W_score p
• Parsing proceeds greedily over the set C of potential adjacent pairs: the highest-scoring pair is merged, the new merged parent is added back to C, and the adjacency matrix is updated; this repeats recursively until one node covers the whole input.
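A toy sketch of one greedy parsing pass under these definitions (the 4-segment adjacency, tanh as the sigmoid-like f, and all names are made up for illustration):

    import numpy as np

    n = 100
    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 0.01, (n, 2 * n))          # merge weights, R^{n x 2n}
    b = np.zeros(n)
    W_score = rng.normal(0.0, 0.01, (1, n))        # scoring layer

    def merge(c_i, c_j):
        # parent representation and merge score for two children
        p = np.tanh(W @ np.concatenate([c_i, c_j]) + b)
        return p, float(W_score @ p)

    nodes = {i: rng.normal(size=n) for i in range(4)}   # toy activations
    pairs = {(0, 1), (1, 2), (2, 3)}                    # toy adjacency
    nxt = 4
    while pairs:
        # pick the highest-scoring potential adjacent pair
        (i, j), (p, s) = max(((pr, merge(nodes[pr[0]], nodes[pr[1]]))
                              for pr in pairs), key=lambda t: t[1][1])
        nodes[nxt] = p                                  # add merged parent
        # update adjacency: the parent inherits its children's neighbours
        pairs = {(nxt if a in (i, j) else a, nxt if c in (i, j) else c)
                 for (a, c) in pairs if (a, c) != (i, j)}
        pairs = {(a, c) for (a, c) in pairs if a != c}
        nxt += 1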
15. Category Classification in the RNN
Each node of the tree built by the RNN has a distributed feature representation p associated with it. We can leverage this representation by adding to each RNN parent node (after removing the scoring layer) a simple softmax layer to predict class labels:
label_p = softmax(W_label p)
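A hedged sketch of such a labeling layer (W_label's initialization and the class count are assumptions):

    import numpy as np

    n, n_classes = 100, 8                          # assumed sizes
    W_label = np.random.default_rng(0).normal(0.0, 0.01, (n_classes, n))

    def predict_label(p):
        # softmax class distribution for a node representation p
        z = W_label @ p
        z -= z.max()                               # numerical stability
        e = np.exp(z)
        return e / e.sum()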
17. Loss Function for Language
For a constituency parser (phrase-structure parser), a constituent (non-terminal) is correct only if:
1. it dominates exactly the correct span of words, and
2. it is the correct type of constituent.
Two candidate parses of "Jim ate the cookies in the bowl", differing in where the PP attaches:
(S[1:7]
(NP[1:1] Jim)
(VP[2:2] ate)
(NP[3:4] the cookies)
(PP[5:7] in
(NP[6:7] the bowl)
)
)
(S[1:7]
(NP[1:1] Jim)
(VP[2:7] ate
(NP[3:7] the cookies
(PP[5:7] in
(NP[6:7] the bowl)
)
)
)
)
The loss can then be computed Hamming-distance style: count the constituents whose (label, span) pairs disagree between the two trees, as in the sketch below.
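To make the count concrete, here is a small sketch that reads the labeled spans off the two trees above and counts the disagreements (plain Python; names are illustrative):

    # (label, start, end) triples from the two bracketings above
    tree_a = {("S",1,7), ("NP",1,1), ("VP",2,2), ("NP",3,4), ("PP",5,7), ("NP",6,7)}
    tree_b = {("S",1,7), ("NP",1,1), ("VP",2,7), ("NP",3,7), ("PP",5,7), ("NP",6,7)}

    # a constituent is correct only if span AND label both match
    diff = tree_a ^ tree_b                         # symmetric difference
    print(len(diff))                               # 4 disagreeing constituents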
18. Loss Function for Image
For the visual parser there is a set of correct trees, so the loss for proposing a parse ŷ for input x with labels l counts the incorrect subtrees:
Δ(x, l, ŷ) = Σ_{d ∈ N(ŷ)} κ · 1{subTree(d) ∉ Y(x, l)}
where N(ŷ) is the set of non-terminal nodes of ŷ and κ is a per-node penalty.
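A direct transcription of this loss into code (the container types and the value of κ are left as assumptions):

    def delta(predicted_nonterminals, correct_subtrees, kappa):
        # penalty kappa for each subtree of the proposal that is not
        # a subtree of any correct tree for (x, l)
        return kappa * sum(1 for d in predicted_nonterminals
                           if d not in correct_subtrees)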
19. RNN for Structure Prediction
Given the training set, we search for a function f with small expected loss on unseen inputs:
f_θ(x) = arg max_{ŷ ∈ T(x)} s(RNN(θ, x, ŷ))
where T(x) is the set of possibly correct trees for x. We assume the problem can be described in terms of a computationally tractable max over the score function s. How do we define the margin?
20. Max-Margin
Hard margin: each correct tree must outscore every other tree by at least the size of that tree's loss:
s(RNN(θ, x_i, y_i)) ≥ s(RNN(θ, x_i, ŷ)) + Δ(x_i, l_i, ŷ) for all ŷ ∈ T(x_i)
Soft margin: add a slack variable to handle non-separable data. Minimizing the slack gives the structured hinge loss that we need to minimize:
r_i(θ) = max_{ŷ ∈ T(x_i)} ( s(RNN(θ, x_i, ŷ)) + Δ(x_i, l_i, ŷ) ) − max_{y ∈ Y(x_i, l_i)} s(RNN(θ, x_i, y))
The max over the true trees Y(x_i, l_i) appears because an image has more than one correct tree.
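The hinge risk is just a difference of two maxes; a minimal sketch with hypothetical score/loss callables and candidate sets:

    def hinge_risk(score, loss, candidates, gold_trees):
        # r_i = max over all candidate trees of (score + loss),
        # minus the best score among the correct trees
        worst_violator = max(score(t) + loss(t) for t in candidates)
        best_gold = max(score(t) for t in gold_trees)
        return worst_violator - best_gold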
27. Experiments in ICML 2011
The final unlabeled bracketing F-measure of our language parser is 90.29%, compared to 91.63% for the widely used Berkeley parser (Petrov et al., 2006); development F1 is virtually identical (92.06% for the RNN, 92.08% for the Berkeley parser).
Unlike most previous systems, our parser does not provide
a parent with information about the syntactic categories of
its children. This shows that our learned, continuous
representations capture enough syntactic information to
make good parsing decisions.