EMNLP 2016 reading
Incorporating Discrete Translation Lexicons
into Neural Machine Translation
authors : Philip Arthur, Graham Neubig, Satoshi Nakamura
presenter : Sekizawa Yuuki
Komachi lab M1
17/02/15 1
Incorporating Discrete Translation Lexicons
into Neural Machine Translation
• NMT often mistranslates
low-frequency content words
• losing the meaning of the sentence
• proposed method
• encode low-frequency words with lexicon probabilities
• 2 methods : 1. use them as a bias 2. linear interpolation
• results (En-Ja translation, on two corpora (KFTT, BTEC) )
• improvements of 2.0-2.3 BLEU, 0.13-0.44 NIST
• faster convergence
NMT feature
• NMT system
• treat each word in the vocabulary as a vector of continuous-
valued numbers
• share statistical power between similar words
(“dog” and “cat”) or contexts (“this is” and “that is”)
• drawback : often mistranslates into words that seem natural in the
context but do not reflect the content of the source sentence
• PBMT/SMT systems rarely make this kind of mistake
• base their translations on discrete phrase mappings
• ensure that source words will be translated into a target word that
has been observed as a translation at least once in the training data
NMT
• source words
• target words
• translation probability : a softmax over the target vocabulary,
computed from a fixed-width context vector with a weight matrix and a bias vector
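The output layer described above (a fixed-width context vector passed through a weight matrix and bias vector, then a softmax) can be sketched in NumPy. The names `eta`, `W`, and `b` are illustrative placeholders, not the authors' notation, and the sizes are toy values.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def nmt_word_prob(eta, W, b):
    # p(e_j | context) = softmax(W @ eta + b):
    # weight matrix W, bias vector b, fixed-width context vector eta.
    return softmax(W @ eta + b)

rng = np.random.default_rng(0)
eta = rng.normal(size=4)      # fixed-width context vector
W = rng.normal(size=(5, 4))   # weight matrix (vocab x hidden)
b = np.zeros(5)               # bias vector
p = nmt_word_prob(eta, W, b)
assert abs(p.sum() - 1.0) < 1e-9 and (p > 0).all()
```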
Integrating Lexicons into NMT
• Lexicon probability
(figure : the lexical matrix built from the input sentence, with one column
of lexicon probabilities per input sentence word and one row per target
vocabulary entry, combined with the alignment probabilities)
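As a sketch of what the figure shows: each column of the lexical matrix holds the lexicon probabilities for one input word, and the alignment (attention) probabilities marginalize over the columns. The numbers below are made up purely for illustration.

```python
import numpy as np

# Hypothetical lexical matrix: one row per target vocabulary entry,
# one column per input sentence word; each column sums to 1.
lex = np.array([[0.70, 0.05],
                [0.20, 0.05],
                [0.10, 0.90]])

# Alignment (attention) probabilities over the two input words
# at the current decoding step.
attn = np.array([0.8, 0.2])

# Marginalize over source positions:
# p_lex(e | x, attn) = sum_i attn[i] * p_lex(e | f_i)
p_lex = lex @ attn
assert abs(p_lex.sum() - 1.0) < 1e-9
```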
Combining the lexicon probability
1. model bias
2. linear interpolation
• x : learnable interpolation parameter (initialized to 0.5)
• a small constant (here 0.001) prevents zero probabilities
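The two combination methods above can be sketched as follows (a toy sketch; the real model applies these inside the decoder's softmax layer). The bias method adds the log lexicon probability to the pre-softmax scores, with a small constant (0.001, matching the slide) to avoid log(0); linear interpolation mixes the two distributions with a learnable coefficient initialized to 0.5. All numeric values below are invented for illustration.

```python
import numpy as np

EPS = 1e-3  # small constant from the slide, prevents log(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def combine_bias(scores, p_lex):
    # Method 1: add log p_lex as a bias to the pre-softmax scores.
    return softmax(scores + np.log(p_lex + EPS))

def combine_linear(p_nmt, p_lex, lam=0.5):
    # Method 2: interpolate the two distributions; lam is a
    # learnable parameter, initialized to 0.5 as on the slide.
    return lam * p_lex + (1.0 - lam) * p_nmt

scores = np.array([1.0, 0.5, -0.5])   # toy pre-softmax NMT scores
p_lex = np.array([0.7, 0.2, 0.1])     # toy lexicon probabilities
p_bias = combine_bias(scores, p_lex)
p_lin = combine_linear(softmax(scores), p_lex)
assert abs(p_bias.sum() - 1.0) < 1e-9 and abs(p_lin.sum() - 1.0) < 1e-9
```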
Constructing Lexicon Probability
1. automatically learned
• uses the EM algorithm
• E-step : compute the expected counts
• M-step : re-estimate the lexicon probability
(normalizing over all possible counts for the translation set of source word f)
2. manual
• use dictionary entries as translations
3. hybrid
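The automatically learned lexicon is trained with EM (the experiments use GIZA++). An IBM Model 1-style sketch makes the E-step (expected counts) and M-step (re-estimated lexicon probabilities) concrete; the toy bitext below is invented for illustration, and a real system would use a full aligner.

```python
from collections import defaultdict

def em_lexicon(bitext, iters=5):
    # IBM Model 1-style EM for p(e | f); a sketch of the
    # "automatically learned" lexicon, not a full aligner.
    prob = defaultdict(float)
    for f_sent, e_sent in bitext:        # uniform-ish initialization
        for f in f_sent:
            for e in e_sent:
                prob[(e, f)] = 1.0
    for _ in range(iters):
        count = defaultdict(float)       # E-step: expected counts c(e, f)
        total = defaultdict(float)
        for f_sent, e_sent in bitext:
            for e in e_sent:
                z = sum(prob[(e, f)] for f in f_sent)
                for f in f_sent:
                    c = prob[(e, f)] / z
                    count[(e, f)] += c
                    total[f] += c
        for pair in count:               # M-step: p(e|f) = c(e,f) / c(f)
            prob[pair] = count[pair] / total[pair[1]]
    return prob

bitext = [(["dog"], ["inu"]),
          (["the", "dog"], ["inu"]),
          (["the"], ["sono"])]
lex = em_lexicon(bitext)
assert lex[("inu", "dog")] > lex[("inu", "the")]
```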
Experiment
• Dataset : KFTT, BTEC
• English to Japanese
• tokenized, lowercased
• sentence length <= 50
• low-frequency words are replaced with <unk>
and translated at test time (Luong et al. (2015))
• frequency threshold : BTEC : 1, KFTT : 3
• Evaluation
• BLEU, NIST, recall (of rare words from the references)
Data   Corpus   Sentences   Tokens (En)   Tokens (Ja)
Train  BTEC     464K        3.60M         4.97M
       KFTT     377K        7.77M         8.04M
Dev    BTEC     510         3.8K          5.3K
       KFTT     1,160       24.3K         26.8K
Test   BTEC     508         3.8K          5.5K
       KFTT     1,169       26.0K         28.4K

(rare words : appear less than 8 times in the target training corpus or references)

vocab-size   source   target
BTEC         17.8k    21.8k
KFTT         48.2k    49.1k
Experiment
• methods
• pbmt : Koehn+ (2003), using Moses
• hiero (hierarchical pbmt) : Chiang+ (2007), using Travatar
• attn : Bahdanau+ (2015), attentional NMT
• auto-bias : proposed, with the automatically learned lexicon
• hyb-bias : proposed, with the hybrid dictionary
• Lexicon
• auto : learned from the training data (separately) with GIZA++
• manual : English-Japanese dictionary (Eijiro : 104k entries)
• hyb : combination of the "auto" and "manual" lexicons
Comparison with related work
• † : p < 0.05, * : p < 0.10
• best improvements : +2.3 BLEU, +0.44 NIST, +30% recall
Comparison with related work
• † : p < 0.05, * : p < 0.10
• KFTT : BLEU improves but NIST drops (compared with SMT)
• traditional SMT systems keep a small advantage
in translating low-frequency words
Translation examples
Training curves
• on KFTT
• blue : attn
• orange : auto-bias
• green : hyb-bias
• from the first iteration, the proposed methods' BLEU is higher than attn's
• time per iteration : 167 minutes (attn) vs. 275 minutes (auto-bias)
• due to computing and applying the lexical probability matrix
Attention matrices
• the proposed method (bias) is more correct
• lighter color : stronger word attention
• red box : correct alignment
Proposed method results
• first column : NMT without a lexicon
• bias
• "man" (the manual lexicon) is less effective,
due to its limited coverage of target-domain words
• linear
• the trend is the reverse of bias
• worse than bias,
due to the constant interpolation coefficient
Incorporating Discrete Translation Lexicons
into Neural Machine Translation (summary)
• NMT often mistranslates
low-frequency content words
• proposed method
• encode low-frequency words with lexicon probabilities
• 2 methods : 1. use them as a bias 2. linear interpolation
• improvements of 2.0-2.3 BLEU, 0.13-0.44 NIST
• faster convergence