SlideShare una empresa de Scribd logo
1 de 1
Descargar para leer sin conexión
Enhanced LSTM for Natural Language Inference
Qian Chen1
, Xiaodan Zhu23
, Zhen-Hua Ling1
,
Si Wei4
, Hui Jiang5
, Diana Inkpen6
1
University of Science and Technology of China
2
National Research Council Canada 3
Queen’s University
4
iFLYTEK Research 5
York University 6
University of Ottawa
Contributions
F Propose a hybrid neural network
model for natural language inference.
? The model achieves the best results
on the SNLI dataset.
F Our first component, Enhanced
Sequential Inference Model (ESIM),
has outperformed the previous best
results.
F Further using tree-LSTM [Zhu, ICML-
2015, Tai, ACL-2015, Le, *SEM-2015] to en-
code syntactic parses can improve the
performance additionally.
Source code available!!!
https://github.com/lukecq1231/nli
Our implementation uses python and is
based on the Theano library.
An example
Natural language inference (NLI) aims
to determine whether a natural-language
hypothesis H can be inferred from a
premise P.
• Premise: A woman wearing a black
dress and green sweater is walking
down the street looking at her cell-
phone.
• Hypothesis 1: A woman is holding
her cell phone. (Entailment)
• Hypothesis 2: A woman is looking
at a text on her cell phone. (Neutral)
• Hypothesis 3: A woman has her cell
phone up to her ear. (Contradiction)
Analysis
Figure 1: Attention visualization of stand-alone
syntactic tree-LSTM model
Hybrid Neural Inference Model
Figure 2: A high-level view of our Enhanced
Sequential Inference Model (ESIM)
1. Input Encoding
Premise: xp
1, xp
2, . . . , xp
N
Hypothesis: xh
1 , xh
2 , . . . , xh
M
Embedding matrix: E 2 RV ⇥Dm
hp
= Enc(E(xp
1), . . . , E(xp
N )) 2 RN⇥De
(1)
hh
= Enc(E(xh
1 ), . . . , E(xh
M )) 2 RM⇥De
(2)
where Enc is BiLSTM or Tree-LSTM [ZSG15,
TSM15]. Here Enc learns to represent a word
(or phrase) and its context.
2. Local Inference Modeling
• Local inference collected
eij = (hp
i )T
hh
j , e 2 RN⇥M
(3)
¯hp
i =
X
j
exp(eij)
P
k exp(eik)
hh
j , ¯hp 2 RN⇥De
(4)
¯hh
j =
X
i
exp(eij)
P
k exp(ekj)
hp
i , ¯hh 2 RM⇥De
(5)
Figure 3: A high-level view of our Hybrid Inference
Model (HIM)
Intuitively, the content in hh
that is relevant to
hp
i will be selected and represented as ¯hp
i , and
vice versa.
• Enhancement of local inference information
mp
= [hp
; ¯hp; hp ¯hp; hp ¯hp] 2 RN⇥4De
(6)
mh
= [hh
; ¯hh; hh ¯hh; hh ¯hh] 2 RM⇥4De
(7)
3. Inference Composition
vp
= Cmp(mp
1, . . . , mp
N ) 2 RN⇥Dc
(8)
vh
= Cmp(mh
1 , . . . , mg
M ) 2 RM⇥Dc
(9)
v = [max(vp
); ave(vp
); max(vh
); ave(vh
)] 2 R4Dc
(10)
where Cmp is BiLSTM or Tree-LSTM. Finally,
we put v into a final MLP classifier.
Results
• Data: Stanford Natural Language Inference
(SNLI) (Training: 550k sentence pairs, held-
out: 10k, testing: 10k)
Table 1: Accuracies of the models on SNLI
Model Test
(1) Handcrafted features [BAPM15] 78.2
(2) LSTM [BGR+
16] 80.6
(3) GRU [VKFU15] 81.4
(4) Tree CNN [MML+
16] 82.1
(5) SPINN-PI [BGR+
16] 83.2
(6) BiLSTM intra-Att [LSLW16] 84.2
(7) NSE [MY16a] 84.6
(8) Att-LSTM [RGH+
15] 83.5
(9) mLSTM [WJ16] 86.1
(10) LSTMN [CDL16] 86.3
(11) Decomposable Att [PTDU16] 86.3
(12) Intra-sent Att+(11) [PTDU16] 86.8
(13) NTI-SLSTM-LSTM [MY16b] 87.3
(14) Re-read LSTM [SCSL16] 87.5
(15) btree-LSTM [PAD+
16] 87.6
(16) ESIM 88.0
(17) HIM (ESIM+Syn.tree-LSTM) 88.6
• Enhanced Sequential Inference Model
(ESIM) achieves an accuracy of 88.0%, which
has already outperformed all the previous
models.
• Hybrid Inference Model (HIM), which en-
sembles our ESIM model with syntactic tree-
LSTMs [ZSG15] based on syntactic parse
trees, achieve additional improvement.
Table 2: Ablation performance of the models
Model Test
(17) HIM (ESIM + syn.tree) 88.6
(18) ESIM + tree 88.2
(16) ESIM 88.0
(19) ESIM - ave./max 87.1
(20) ESIM - diff./prod. 87.0
(21) ESIM - inference BiLSTM 87.3
(22) ESIM - encoding BiLSTM 86.3
(23) ESIM - P-based attention 87.2
(24) ESIM - H-based attention 86.5
(25) syn.tree 87.8
Training Speed: tree-LSTM takes about 40
hours on Nvidia-Tesla K40M and ESIM takes
about 6 hours.

Más contenido relacionado

Más de Association for Computational Linguistics

Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Association for Computational Linguistics
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Association for Computational Linguistics
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopAssociation for Computational Linguistics
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...Association for Computational Linguistics
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...Association for Computational Linguistics
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Association for Computational Linguistics
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Association for Computational Linguistics
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopAssociation for Computational Linguistics
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Association for Computational Linguistics
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Association for Computational Linguistics
 
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...Association for Computational Linguistics
 
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...Association for Computational Linguistics
 
Toshiaki Nakazawa - 2015 - Overview of the 2nd Workshop on Asian Translation
Toshiaki Nakazawa  - 2015 - Overview of the 2nd Workshop on Asian TranslationToshiaki Nakazawa  - 2015 - Overview of the 2nd Workshop on Asian Translation
Toshiaki Nakazawa - 2015 - Overview of the 2nd Workshop on Asian TranslationAssociation for Computational Linguistics
 
Hua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT System
Hua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT SystemHua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT System
Hua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT SystemAssociation for Computational Linguistics
 
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...Association for Computational Linguistics
 
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...Association for Computational Linguistics
 
Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...
Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...
Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...Association for Computational Linguistics
 

Más de Association for Computational Linguistics (20)

Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
Matthew Marge - 2017 - Exploring Variation of Natural Human Commands to a Rob...
 
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
Venkatesh Duppada - 2017 - SeerNet at EmoInt-2017: Tweet Emotion Intensity Es...
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
John Richardson - 2015 - KyotoEBMT System Description for the 2nd Workshop on...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
Zhongyuan Zhu - 2015 - Evaluating Neural Machine Translation in English-Japan...
 
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
Hyoung-Gyu Lee - 2015 - NAVER Machine Translation System for WAT 2015
 
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 WorkshopSatoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
Satoshi Sonoh - 2015 - Toshiba MT System Description for the WAT2015 Workshop
 
Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015Chenchen Ding - 2015 - NICT at WAT 2015
Chenchen Ding - 2015 - NICT at WAT 2015
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
 
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
Graham Neubig - 2015 - Neural Reranking Improves Subjective Quality of Machin...
 
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
 
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
Terumasa Ehara - 2015 - System Combination of RBMT plus SPE and Preordering p...
 
Toshiaki Nakazawa - 2015 - Overview of the 2nd Workshop on Asian Translation
Toshiaki Nakazawa  - 2015 - Overview of the 2nd Workshop on Asian TranslationToshiaki Nakazawa  - 2015 - Overview of the 2nd Workshop on Asian Translation
Toshiaki Nakazawa - 2015 - Overview of the 2nd Workshop on Asian Translation
 
Hua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT System
Hua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT SystemHua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT System
Hua Shan - 2015 - A Dependency-to-String Model for Chinese-Japanese SMT System
 
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
 
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
 
Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...
Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...
Katsuhito Sudoh - 2015 Chinese-to-Japanese Patent Machine Translation based o...
 

Último

IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 

Último (20)

IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 

Qian Chen - 2017 - Enhanced LSTM for Natural Language Inference

  • 1. Enhanced LSTM for Natural Language Inference Qian Chen1 , Xiaodan Zhu23 , Zhen-Hua Ling1 , Si Wei4 , Hui Jiang5 , Diana Inkpen6 1 University of Science and Technology of China 2 National Research Council Canada 3 Queen’s University 4 iFLYTEK Research 5 York University 6 University of Ottawa Contributions F Propose a hybrid neural network model for natural language inference. ? The model achieves the best results on the SNLI dataset. F Our first component, Enhanced Sequential Inference Model (ESIM), has outperformed the previous best results. F Further using tree-LSTM [Zhu, ICML- 2015, Tai, ACL-2015, Le, *SEM-2015] to en- code syntactic parses can improve the performance additionally. Source code available!!! https://github.com/lukecq1231/nli Our implementation uses python and is based on the Theano library. An example Natural language inference (NLI) aims to determine whether a natural-language hypothesis H can be inferred from a premise P. • Premise: A woman wearing a black dress and green sweater is walking down the street looking at her cell- phone. • Hypothesis 1: A woman is holding her cell phone. (Entailment) • Hypothesis 2: A woman is looking at a text on her cell phone. (Neutral) • Hypothesis 3: A woman has her cell phone up to her ear. (Contradiction) Analysis Figure 1: Attention visualization of stand-alone syntactic tree-LSTM model Hybrid Neural Inference Model Figure 2: A high-level view of our Enhanced Sequential Inference Model (ESIM) 1. Input Encoding Premise: xp 1, xp 2, . . . , xp N Hypothesis: xh 1 , xh 2 , . . . , xh M Embedding matrix: E 2 RV ⇥Dm hp = Enc(E(xp 1), . . . , E(xp N )) 2 RN⇥De (1) hh = Enc(E(xh 1 ), . . . , E(xh M )) 2 RM⇥De (2) where Enc is BiLSTM or Tree-LSTM [ZSG15, TSM15]. Here Enc learns to represent a word (or phrase) and its context. 2. Local Inference Modeling • Local inference collected eij = (hp i )T hh j , e 2 RN⇥M (3) ¯hp i = X j exp(eij) P k exp(eik) hh j , ¯hp 2 RN⇥De (4) ¯hh j = X i exp(eij) P k exp(ekj) hp i , ¯hh 2 RM⇥De (5) Figure 3: A high-level view of our Hybrid Inference Model (HIM) Intuitively, the content in hh that is relevant to hp i will be selected and represented as ¯hp i , and vice versa. • Enhancement of local inference information mp = [hp ; ¯hp; hp ¯hp; hp ¯hp] 2 RN⇥4De (6) mh = [hh ; ¯hh; hh ¯hh; hh ¯hh] 2 RM⇥4De (7) 3. Inference Composition vp = Cmp(mp 1, . . . , mp N ) 2 RN⇥Dc (8) vh = Cmp(mh 1 , . . . , mg M ) 2 RM⇥Dc (9) v = [max(vp ); ave(vp ); max(vh ); ave(vh )] 2 R4Dc (10) where Cmp is BiLSTM or Tree-LSTM. Finally, we put v into a final MLP classifier. Results • Data: Stanford Natural Language Inference (SNLI) (Training: 550k sentence pairs, held- out: 10k, testing: 10k) Table 1: Accuracies of the models on SNLI Model Test (1) Handcrafted features [BAPM15] 78.2 (2) LSTM [BGR+ 16] 80.6 (3) GRU [VKFU15] 81.4 (4) Tree CNN [MML+ 16] 82.1 (5) SPINN-PI [BGR+ 16] 83.2 (6) BiLSTM intra-Att [LSLW16] 84.2 (7) NSE [MY16a] 84.6 (8) Att-LSTM [RGH+ 15] 83.5 (9) mLSTM [WJ16] 86.1 (10) LSTMN [CDL16] 86.3 (11) Decomposable Att [PTDU16] 86.3 (12) Intra-sent Att+(11) [PTDU16] 86.8 (13) NTI-SLSTM-LSTM [MY16b] 87.3 (14) Re-read LSTM [SCSL16] 87.5 (15) btree-LSTM [PAD+ 16] 87.6 (16) ESIM 88.0 (17) HIM (ESIM+Syn.tree-LSTM) 88.6 • Enhanced Sequential Inference Model (ESIM) achieves an accuracy of 88.0%, which has already outperformed all the previous models. • Hybrid Inference Model (HIM), which en- sembles our ESIM model with syntactic tree- LSTMs [ZSG15] based on syntactic parse trees, achieve additional improvement. Table 2: Ablation performance of the models Model Test (17) HIM (ESIM + syn.tree) 88.6 (18) ESIM + tree 88.2 (16) ESIM 88.0 (19) ESIM - ave./max 87.1 (20) ESIM - diff./prod. 87.0 (21) ESIM - inference BiLSTM 87.3 (22) ESIM - encoding BiLSTM 86.3 (23) ESIM - P-based attention 87.2 (24) ESIM - H-based attention 86.5 (25) syn.tree 87.8 Training Speed: tree-LSTM takes about 40 hours on Nvidia-Tesla K40M and ESIM takes about 6 hours.