14/06/05 Copyright (C) 2014 by Yusuke Oda, AHC-Lab, IS, NAIST 1
Tree-based Translation Models
Yusuke Oda
@odashi_t
2014/6/5 NAIST MT-Study Group
Agenda
● (6.2) Synchronous Context Free Grammar (SCFG)
– (6.2.2) Learning SCFG
– (6.2.3) Introducing Syntax Labels
– (6.2.4) Features
– (6.2.5) Decoding
– (6.2.6) Rescoring
● (6.3) Synchronous Tree Substitution Grammar (STSG)
– (6.3.1) Characteristics of STSG
– (6.3.2) Learning STSG
– (6.3.3) Features
– (6.3.4) Decoding
– (6.3.5) Binarization
● (6.4) Synchronous Parsing
– (6.4.1) Inversion Transduction Grammar (ITG)
– (6.4.2) Span Pruning
– (6.4.3) Beam Search
– (6.4.4) Two Parsing
Systems mentioned: Hiero, Travatar
Synchronous Context Free Grammar (SCFG)
Learning SCFG
● Synchronous rules are extracted from each sentence pair of the parallel corpus together with its word alignment, $\langle f, e, a \rangle$.
● $f$: Source sentence
● $e$: Target sentence
● $a$: Set of word alignments
Closed Phrase Pair under Word Alignment
● A phrase pair is closed under its word alignment when every alignment link that touches a word inside the phrase pair has both of its ends inside the phrase pair.
[Figure: word alignment between "he will dissolve the diet in the near future" and "彼 は 近い うち に 国会 を 解散 する"; example of a closed phrase pair: (国会 を → the diet)]
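The closedness check can be sketched in a few lines of Python. This is a minimal illustration, not the extraction code of any particular toolkit. The alignment links below are an assumption made for the example (the links drawn on the slide are not recoverable), and the names `is_closed` and `closed_phrase_pairs` are hypothetical.

```python
# Hypothetical alignment for the slide's sentence pair (0-based indexes).
# Source: 彼 は 近い うち に 国会 を 解散 する
# Target: he will dissolve the diet in the near future
ALIGNMENT = {(0, 0), (2, 7), (3, 8), (4, 5), (5, 3), (5, 4), (7, 2), (8, 2)}


def is_closed(alignment, f_span, e_span):
    """Return True if the phrase pair is closed under the alignment.

    alignment: set of (source index, target index) links.
    f_span, e_span: inclusive (start, end) index ranges.
    """
    (fs, fe), (es, ee) = f_span, e_span
    # Links that touch the phrase pair on either side.
    touching = [(i, j) for i, j in alignment if fs <= i <= fe or es <= j <= ee]
    # Closed: at least one link, and no touching link leaves the pair.
    return bool(touching) and all(
        fs <= i <= fe and es <= j <= ee for i, j in touching
    )


def closed_phrase_pairs(f_len, e_len, alignment, max_len=4):
    """Enumerate all closed phrase pairs with at most max_len words per side."""
    pairs = []
    for fs in range(f_len):
        for fe in range(fs, min(fs + max_len, f_len)):
            for es in range(e_len):
                for ee in range(es, min(es + max_len, e_len)):
                    if is_closed(alignment, (fs, fe), (es, ee)):
                        pairs.append(((fs, fe), (es, ee)))
    return pairs
```

Under the assumed links, `is_closed(ALIGNMENT, (5, 6), (3, 4))` accepts (国会 を, the diet), while `is_closed(ALIGNMENT, (5, 5), (4, 4))` rejects (国会, diet) because 国会 is also linked to "the".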
Extracting Abstract Rules
● We can make more abstract synchronous rules by replacing part of a phrase pair with a non-terminal symbol when the phrase pair covers other, smaller phrase pairs.
[Figure: word alignment between "dissolve the diet in the near future" and "近い うち に 国会 を 解散 する", with the smaller phrase pairs cut out of the larger one]
– Smaller phrase pairs: (国会 を, the diet), (近い うち, near future)
– Remaining words: (近い うち ... 解散 する, dissolve the ... near future)
– Abstract rule: (X1 に X2 解散 する, dissolve X2 in the X1)
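The substitution step can be sketched as follows. This is a minimal illustration that assumes the sub phrase pairs are given as non-overlapping index spans; the function name `abstract_rule` and the span representation are hypothetical, not taken from the slides.

```python
def abstract_rule(f_words, e_words, sub_pairs):
    """Replace each sub phrase pair with a linked non-terminal X1, X2, ...

    sub_pairs: list of ((f_start, f_end), (e_start, e_end)) inclusive spans,
    relative to f_words / e_words and assumed not to overlap.
    """
    f_out, e_out = list(f_words), list(e_words)
    # Number the non-terminals in source order.
    numbered = sorted(sub_pairs, key=lambda pair: pair[0][0])
    f_edits = [(f_span, f"X{k}") for k, (f_span, _) in enumerate(numbered, 1)]
    e_edits = [(e_span, f"X{k}") for k, (_, e_span) in enumerate(numbered, 1)]
    # Replace from right to left so that earlier indexes stay valid.
    for (start, end), symbol in sorted(f_edits, reverse=True):
        f_out[start:end + 1] = [symbol]
    for (start, end), symbol in sorted(e_edits, reverse=True):
        e_out[start:end + 1] = [symbol]
    return f_out, e_out


f_words = "近い うち に 国会 を 解散 する".split()
e_words = "dissolve the diet in the near future".split()
# (近い うち, near future) and (国会 を, the diet) as index spans.
sub_pairs = [((0, 1), (5, 6)), ((3, 4), (1, 2))]
f_rule, e_rule = abstract_rule(f_words, e_words, sub_pairs)
print(" ".join(f_rule), "|||", " ".join(e_rule))
```

The non-terminals are numbered by source position here, which reproduces the slide's rule (X1 に X2 解散 する, dissolve X2 in the X1) for this example.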
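Hiero Grammar
● Hierarchical phrase grammar (Hiero grammar):
– The set of all synchronous rules extracted from the parallel corpus
● Algorithm:
1. $G \leftarrow \{ X \to \langle \bar{f}, \bar{e} \rangle \mid \langle \bar{f}, \bar{e} \rangle \in P \}$,
where $P$ is the set of all possible phrase pairs in the parallel corpus.
2. If a rule $X \to \langle \gamma, \alpha \rangle \in G$ and a phrase pair $\langle \bar{f}, \bar{e} \rangle \in P$ satisfy $\gamma = \gamma_1 \bar{f} \gamma_2$ and $\alpha = \alpha_1 \bar{e} \alpha_2$, then
3. $G \leftarrow G \cup \{ X \to \langle \gamma_1 X_k \gamma_2, \alpha_1 X_k \alpha_2 \rangle \}$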
Constraints of Hiero Rules
● To limit the size and ambiguity of a Hiero grammar, we can impose constraints on rule extraction.
● Minimal phrase pair
– (国会 を, the diet) ... BAD
– (国会, the diet) ... GOOD
[Figure: word alignment between "the diet" and "国会 を"]
● Phrase length
– (奈良 先端 科学 技術 大学院 大学 情報 科学 研究 科 自然 言語 処理 学 研究 室, ...) BAD (too many words)
● Number of symbols
– X → 〈あらゆる X1 を 全て X2 の 方 へ ねじ曲げ た の だ, ...〉 BAD (too many symbols)
● Rank of rule
– X → 〈X1 が X2 で X3 に X4 した, ...〉 BAD (too many non-terminals)
Glue Rules
● To build a whole sentence out of small rules, we introduce the following glue rules:
$S \to \langle S_1 X_2, S_1 X_2 \rangle$
$S \to \langle X_1, X_1 \rangle$
Introducing Syntax Labels
● Up to here, we have covered the basic ideas of Hiero rules.
– The only non-terminal symbols are $X$ and $S$.
● This model is very simple, but very ambiguous.
● Next, we introduce syntax information into Hiero rules.
= Syntax-augmented machine translation (SAMT)
[Figure: parse tree of "this is a pen" with the labels S, NP, VP, PRP, VBZ, DT, NN; Hiero + Syntax → SAMT]
Combinatory Categorial Grammar (CCG)
● SAMT uses categories (≒ partial structures of syntax labels) based on the idea of combinatory categorial grammar (CCG).
● Categories:
$A/B$: Syntax label $A$ lacking its right-side child $B$
$A \backslash B$: Syntax label $A$ lacking its left-side child $B$
$A+B$: Concatenation of two syntax labels $A$ and $B$
Extracting SAMT Rules
[Figure: word alignment between "dissolve the diet in the near future" and "近い うち に 国会 を 解散 する", with the English parse tree (VP, VB, NP, PP, DT, NNP, IN, JJ, NN) and the SAMT categories NP\DT, IN+DT, VP/PP, VP\VB]
VP → 〈NP\DT1 に NP2 解散 する, dissolve NP2 in the NP\DT1〉
VP → 〈近い うち IN+DT1 国会 を VB2, VB2 the diet IN+DT1 near future〉
etc.
Probabilistic Formalization of Hiero Model
● We treat translation with a Hiero grammar as maximization of the posterior probability (as in the phrase-based model):
$\hat{e} = \arg\max_{e} \Pr(e \mid f)$
● We assume that the probability is modeled by a log-linear model:
$\Pr(D \mid f) \propto \exp\left( \mathbf{w}^\top \mathbf{h}(f, D) \right)$
$D$: Derivation (≒ set of used synchronous rules)
$\mathbf{w}$: Weights
$\mathbf{h}$: Feature functions
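A log-linear model scores each candidate derivation by the weighted sum of its feature values, and normalizing the exponentiated scores gives a probability distribution over candidates. The sketch below illustrates that arithmetic only. The feature names, weights and values are invented for the example and are not taken from the slides.

```python
import math


def loglinear_score(features, weights):
    """Weighted sum of feature values (the exponent of the log-linear model)."""
    return sum(weights[name] * value for name, value in features.items())


def posterior(candidates, weights):
    """Normalize exp(score) over the candidate list (a softmax)."""
    scores = [loglinear_score(h, weights) for h in candidates]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def best_candidate(candidates, weights):
    """Index of the highest-scoring candidate; normalization is not needed."""
    return max(
        range(len(candidates)),
        key=lambda i: loglinear_score(candidates[i], weights),
    )


# Hypothetical weights and feature vectors (log probabilities and a word count).
weights = {"forward": 1.0, "backward": 0.5, "lm": 1.0, "length": -0.1}
candidates = [
    {"forward": -1.0, "backward": -2.0, "lm": -3.0, "length": 5},
    {"forward": -2.0, "backward": -1.0, "lm": -2.0, "length": 6},
]
probs = posterior(candidates, weights)
best = best_candidate(candidates, weights)
```

Because the normalizer is the same for every candidate, the argmax can be taken over the unnormalized scores, which is what `best_candidate` does.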
Features of Hiero Model (1)
● Generative model: likelihoods of translation probability
– Forward model
– Backward model
Features of Hiero Model (2)
● Generative model: likelihoods of translation probability
– Syntax model (f)
– Syntax model (e)
Features of Hiero Model (3)
● Lexical translation model: goodness of phrase alignment
– Forward model
– Backward model
Features of Hiero Model (4)
● Language model: measures the fluency of the hypothesis
● Out-of-vocabulary (OOV) penalty: adjusts the LM
● Length penalty: adjusts the number of words in the hypothesis
● Glue penalty: adjusts the number of glue rules in the derivation
Decoding of Hiero Model
● Given an input sentence $f$ and a set of SCFG rules $G$, we find the optimal output sequence $\hat{e}$:
$\hat{e} = e\left( \arg\max_{D \in \mathcal{D}(f, G)} \mathbf{w}^\top \mathbf{h}(f, D) \right)$
$\mathcal{D}(f, G)$: Set of possible derivations of $f$ given a grammar $G$
$e(D)$: Sequence of terminal symbols in a given derivation $D$
Decoding Process
1. Calculate the intersection between the input sentence and the grammar.
• = Generating a syntax forest using the CYK algorithm
2. Transform the syntax forest into the corresponding translation forest.
3. Output the sequence of terminal symbols in the translation forest that maximizes the model score.
[Figure: source-side syntax tree of "犬 が 本 の 上 に 座った" (labels S, NP, VP, PP, P, V) and the corresponding target-side tree of "the dog sat on the book"]
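The three steps can be sketched with a toy decoder. This is a simplified illustration under several assumptions, not the Hiero decoder itself. It uses a single non-terminal type, no language model and no pruning, and it runs a memoized top-down search over spans, which fills the same span chart that a bottom-up CYK parser would. The rules and their scores are invented for the example.

```python
from functools import lru_cache


def decode(words, rules):
    """Best (score, target words) for `words` under a toy SCFG, or None.

    rules: list of (source_rhs, target_rhs, log_score) with tuple sides.
    "X1" and "X2" are linked non-terminals, anything else is a terminal.
    Assumes no rule whose source side is a lone non-terminal.
    """
    words = tuple(words)

    def is_nt(symbol):
        return symbol in ("X1", "X2")

    def matches(pattern, i, j):
        """Yield {non-terminal: (start, end)} so that pattern covers words[i:j]."""
        if not pattern:
            if i == j:
                yield {}
            return
        head, rest = pattern[0], pattern[1:]
        if is_nt(head):
            for k in range(i + 1, j + 1):  # a non-terminal covers >= 1 word
                for binding in matches(rest, k, j):
                    yield {head: (i, k), **binding}
        elif i < j and words[i] == head:
            yield from matches(rest, i + 1, j)

    @lru_cache(maxsize=None)
    def best(i, j):
        """Best derivation of the span words[i:j] (one chart cell)."""
        result = None
        for source, target, score in rules:
            for binding in matches(source, i, j):
                subs = {nt: best(*span) for nt, span in binding.items()}
                if any(sub is None for sub in subs.values()):
                    continue
                total = score + sum(sub[0] for sub in subs.values())
                output = []
                for symbol in target:
                    output.extend(subs[symbol][1] if is_nt(symbol) else [symbol])
                if result is None or total > result[0]:
                    result = (total, tuple(output))
        return result

    return best(0, len(words))


# Hypothetical rules with made-up log scores.
RULES = [
    (("犬",), ("the", "dog"), -0.1),
    (("本",), ("the", "book"), -0.2),
    (("X1", "の", "上", "に"), ("on", "X1"), -0.3),
    (("X1", "が", "X2", "座った"), ("X1", "sat", "X2"), -0.4),
]
result = decode("犬 が 本 の 上 に 座った".split(), RULES)
```

Each call to `best(i, j)` corresponds to one chart cell. Keeping only the single best entry per cell is sufficient here because the toy score has no language model term, so a sub-derivation's score does not depend on its context.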
Synchronous Tree Substitution Grammar (STSG)
Synchronous Tree Substitution Grammar
● STSG is an extension of Tree Substitution Grammar (TSG) for bilingual analysis.
● STSG is a subset of Synchronous Tree Adjoining Grammar (STAG):
SCFG (Hiero) ⊂ STSG ⊂ STAG
● Definition: an STSG consists of
– Set of non-terminal symbols
– Start symbol
– Set of terminal symbols
– Set of rules
– Weight semiring
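Synchronous Rules of STSG
● Definition: a synchronous rule is a triple $\langle \gamma, \alpha, \sim \rangle$
where $\gamma$: Elementary tree (source language)
$\alpha$: Elementary tree (target language)
$\sim$: Association between $\gamma$ and $\alpha$
● Every rule is also associated with a weight.
[Figure: example rule pairing the source tree S(x1:NP, VP(x2:NP, V(開けた))) with the target tree S(x1:NP, VP(VBD(opened), x2:NP)); the leaves are marked as the frontier]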
Expressive Power of STSG
● SCFG cannot express differences in syntactic structure between the two languages, but STSG can.
● Example:
– This synchronous rule cannot be built from smaller SCFG rules, because the two trees share no corresponding substructure.
– The STSG framework can handle such correspondences between tree structures directly.
[Figure: synchronous rule pairing a Japanese NP tree (labels NP, PP, N, P, x1:CD, PC; words 犬, が, 匹) with an English NP tree (labels x1:CD, NNS; word "dogs")]
Translation Models under STSG Framework
● In the STSG framework, we can use the sequence of frontier nodes (the leaves of a synchronous rule) instead of the full tree.
● Four translation models are available, depending on whether we choose the tree or the frontier sequence as the data structure for the source and the target language.

                   | Target: frontier                      | Target: tree
Source: frontier   | String-to-string translation (= SCFG) | String-to-tree translation
Source: tree       | Tree-to-string translation            | Tree-to-tree translation

[Figure: the tree S(x1:NP, VP(x2:NP, V(開けた))) and its sequence of frontier nodes]
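Reading off the frontier sequence of an elementary tree is a left-to-right traversal of its leaves. The sketch below assumes a nested-tuple tree encoding chosen for this example only: a node is `(label, children)` and a leaf is a string, either a terminal word or a variable such as `"x1:NP"`.

```python
def frontier(tree):
    """Left-to-right leaves (terminals and variables) of an elementary tree."""
    if isinstance(tree, str):
        return [tree]
    _, children = tree
    leaves = []
    for child in children:
        leaves.extend(frontier(child))
    return leaves


# Source and target side of the 開けた / opened rule shown on the slides.
source_tree = ("S", ["x1:NP", ("VP", ["x2:NP", ("V", ["開けた"])])])
target_tree = ("S", ["x1:NP", ("VP", [("VBD", ["opened"]), "x2:NP"])])

source_frontier = frontier(source_tree)
target_frontier = frontier(target_tree)
```

A tree-to-string rule would keep `source_tree` together with `target_frontier`, while a string-to-tree rule would keep `source_frontier` together with `target_tree`.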
Retrieving STSG Synchronous Rules
● Heuristic method (similar to SCFG rule extraction)
– Input: the syntax tree generated from the source sentence and the syntax tree generated from the target sentence
[Figure: word alignment between "dissolve the diet in the near future" and "近い うち に 国会 を 解散 する" with the parse trees of both sentences; example of an extracted rule: a source VP with x1:PP, x2:NP and 解散 する, paired with a target VP with x1:PP, x2:NP and VB "dissolve"]
GHKM Algorithm
● Galley-Hopkins-Knight-Marcu (GHKM) algorithm
– Generates STSG synchronous rules (string-to-tree rules) by composing minimal rules using the inside-outside algorithm.
1. Detect minimal rules from the target syntax trees.
2. Generate larger synchronous rules by composing minimal rules.
[Figure: a syntax tree split into minimal rules]
GHKM: Alignment Span (1)
● Alignment span of a node $n$:
– Set of indexes of the source-sentence words aligned to the partial tree rooted at $n$
● Complement alignment span of $n$:
– Set of indexes of the source-sentence words aligned to target words outside that partial tree
● Closure of an alignment span:
– Minimum contiguous range that covers the alignment span
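The three definitions translate directly into set operations. In this minimal sketch a partial tree is represented only by the set of target word indexes it covers, and the alignment links are an assumption made for the example (the links drawn on the slides are not recoverable).

```python
# Hypothetical alignment (source index, target index), 0-based.
# Source: 彼 は 近い うち に 国会 を 解散 する
# Target: he will dissolve the diet in the near future
ALIGNMENT = {(0, 0), (2, 7), (3, 8), (4, 5), (5, 3), (5, 4), (7, 2), (8, 2)}


def alignment_span(target_indexes, alignment):
    """Source indexes aligned to any of the given target word indexes."""
    return {i for i, j in alignment if j in target_indexes}


def complement_span(target_indexes, target_len, alignment):
    """Source indexes aligned to target words outside the partial tree."""
    outside = set(range(target_len)) - set(target_indexes)
    return alignment_span(outside, alignment)


def closure(span):
    """Smallest contiguous index range that covers the span."""
    return set(range(min(span), max(span) + 1)) if span else set()


# The NP "the near future" covers target words 6..8.
np_span = alignment_span({6, 7, 8}, ALIGNMENT)
np_complement = complement_span({6, 7, 8}, 9, ALIGNMENT)
# The VP "dissolve ... future" covers target words 2..8.
vp_closure = closure(alignment_span(set(range(2, 9)), ALIGNMENT))
```

With these links, the VP's alignment span skips the unaligned を (index 6), and the closure fills that gap.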
GHKM: Alignment Span (2)
[Figure: word alignment between "he will dissolve the diet in the near future" and "彼 は 近い うち に 国会 を 解散 する", with the English parse tree (S, NP, PRP, MD, VP, VB, NP, DT, NNP, PP, IN, NP, DT, JJ, NN), illustrating alignment spans]
GHKM: Admissible Node
● Admissible node:
– Node in the target syntax tree for which the closure of its alignment span does not overlap its complement alignment span
[Figure: the aligned sentence pair "he will dissolve the diet in the near future" / "彼 は 近い うち に 国会 を 解散 する" and the English parse tree, with the admissible nodes marked]
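The admissibility test can be sketched as a tree traversal. Everything concrete here is an assumption made for the example: the tree shape is rebuilt from the labels on the slide, the alignment links are hypothetical, and nodes with an empty alignment span are skipped, which is one possible way to treat unaligned words.

```python
# Hypothetical alignment (source index, target index), 0-based.
ALIGNMENT = {(0, 0), (2, 7), (3, 8), (4, 5), (5, 3), (5, 4), (7, 2), (8, 2)}

# A node is (label, children); a leaf is a target word index.
# Target: he will dissolve the diet in the near future
TREE = ("S", [
    ("NP", [("PRP", [0])]),
    ("MD", [1]),
    ("VP", [
        ("VB", [2]),
        ("NP", [("DT", [3]), ("NNP", [4])]),
        ("PP", [("IN", [5]),
                ("NP", [("DT", [6]), ("JJ", [7]), ("NN", [8])])]),
    ]),
])


def leaves(node):
    """Target word indexes covered by a node."""
    if isinstance(node, int):
        return {node}
    _, children = node
    return set().union(*(leaves(child) for child in children))


def admissible_nodes(tree, alignment):
    """List (label, first word, last word) for every admissible node."""
    found = []

    def visit(node):
        if isinstance(node, int):
            return
        label, children = node
        inside = leaves(node)
        span = {i for i, j in alignment if j in inside}
        complement = {i for i, j in alignment if j not in inside}
        if span:  # assumption: nodes without aligned words are skipped
            closed = set(range(min(span), max(span) + 1))
            if not closed & complement:
                found.append((label, min(inside), max(inside)))
        for child in children:
            visit(child)

    visit(tree)
    return found


nodes = admissible_nodes(TREE, ALIGNMENT)
```

Under these links the NP over "the diet" is admissible, but its children DT and NNP are not, because 国会 is aligned to both words and each child's span therefore overlaps its complement span.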
GHKM: Minimal Rule
● Split the syntax tree at the admissible nodes.
[Figure: the aligned sentence pair and English parse tree from the previous slides, split into minimal rules; examples shown: a VP rule with the variables x1:PP, x2:NP, x3:VB paired with the variable sequence "x3 x2 x1", and the fragment "the near future" (DT JJ NN) paired with "近い うち"]
Extension for Tree-to-tree Model (1)
● We need to extract pairs of nodes from the two syntax trees that are admissible with respect to each other.
● First, find the admissible nodes in the given input.
● A node pair is bidirectionally admissible when it satisfies a condition defined on the spans of the two nodes.
● Span of a node:
– Minimum range over the sentence that covers all terminal symbols under the node
Extension for Tree-to-tree Model (2)
[Figure: word alignment between "dissolve the diet in the near future" and "近い うち に 国会 を 解散 する" with the parse trees of both sentences (English labels VP, VB, NP, PP, DT, NNP, IN, JJ, NN; Japanese labels VP, PP, NP, N, P)]
Features of STSG Model (1)
● Generative model: likelihoods of translation probability
Features of STSG Model (2)
● Lexical translation model: goodness of phrase alignment
Features of STSG Model (3)
● Height penalty: adjusts the depth of the derivation
● Internal node penalty: adjusts the total size of the derivation
● Some features introduced for the Hiero model are also available
Decoding of STSG Model
● STSG decoding is basically the same as Hiero decoding; the details depend on the translation model.
Difference of Formalization of Each Model
● String-to-string model
– Same model as the Hiero (SCFG) model.
● String-to-tree model
– Never uses any information from the syntax of the source sentence.
● Tree-to-string model
● Tree-to-tree model
– Explicitly use syntax information of the source sentence.
– The translation process can be divided into syntax analysis and decoding.
[Figure: pipeline Source sentence → Syntax analyzer → Syntax tree of source sentence → Decoder → Translation hypothesis(es); labels: non-syntax-based translation, syntax (tree)-based translation]
Formalization of Syntax-based Translation
● A syntax-based translation model uses the syntax tree of the source sentence.
● We can ignore the syntax-analysis term because the source syntax tree is already decided during syntax analysis.
Questions & Discussions

Más contenido relacionado

La actualidad más candente

2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in KotlinGiovanni Ciatto
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinaloscon2007
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyTravis Oliphant
 
Timed Colored Perti Nets
Timed Colored Perti NetsTimed Colored Perti Nets
Timed Colored Perti Netssandra sukarieh
 

La actualidad más candente (6)

2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin2P-Kt: logic programming with objects & functions in Kotlin
2P-Kt: logic programming with objects & functions in Kotlin
 
Os Reindersfinal
Os ReindersfinalOs Reindersfinal
Os Reindersfinal
 
Java Semantics
Java SemanticsJava Semantics
Java Semantics
 
NLP in 2020
NLP in 2020NLP in 2020
NLP in 2020
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
 
Timed Colored Perti Nets
Timed Colored Perti NetsTimed Colored Perti Nets
Timed Colored Perti Nets
 

Destacado

Pattern Recognition and Machine Learning: Section 3.3
Pattern Recognition and Machine Learning: Section 3.3Pattern Recognition and Machine Learning: Section 3.3
Pattern Recognition and Machine Learning: Section 3.3Yusuke Oda
 
notes as .ppt
notes as .pptnotes as .ppt
notes as .pptbutest
 
翻訳精度の最大化による同時音声翻訳のための文分割法 (NLP2014)
翻訳精度の最大化による同時音声翻訳のための文分割法 (NLP2014)翻訳精度の最大化による同時音声翻訳のための文分割法 (NLP2014)
翻訳精度の最大化による同時音声翻訳のための文分割法 (NLP2014)Yusuke Oda
 
ACL Reading @NAIST: Fast and Robust Neural Network Joint Model for Statistica...
ACL Reading @NAIST: Fast and Robust Neural Network Joint Model for Statistica...ACL Reading @NAIST: Fast and Robust Neural Network Joint Model for Statistica...
ACL Reading @NAIST: Fast and Robust Neural Network Joint Model for Statistica...Yusuke Oda
 
複数の事前並べ替え候補を用いた句に基づく統計的機械翻訳
複数の事前並べ替え候補を用いた句に基づく統計的機械翻訳複数の事前並べ替え候補を用いた句に基づく統計的機械翻訳
複数の事前並べ替え候補を用いた句に基づく統計的機械翻訳Yusuke Oda
 
Part 2: Unsupervised Learning Machine Learning Techniques
Part 2: Unsupervised Learning Machine Learning Techniques Part 2: Unsupervised Learning Machine Learning Techniques
Part 2: Unsupervised Learning Machine Learning Techniques butest
 
ChainerによるRNN翻訳モデルの実装+@
ChainerによるRNN翻訳モデルの実装+@ChainerによるRNN翻訳モデルの実装+@
ChainerによるRNN翻訳モデルの実装+@Yusuke Oda
 
Encoder-decoder 翻訳 (TISハンズオン資料)
Encoder-decoder 翻訳 (TISハンズオン資料)Encoder-decoder 翻訳 (TISハンズオン資料)
Encoder-decoder 翻訳 (TISハンズオン資料)Yusuke Oda
 
最近のDeep Learning (NLP) 界隈におけるAttention事情
最近のDeep Learning (NLP) 界隈におけるAttention事情最近のDeep Learning (NLP) 界隈におけるAttention事情
最近のDeep Learning (NLP) 界隈におけるAttention事情Yuta Kikuchi
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...Sebastian Raschka
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningLars Marius Garshol
 

Destacado (13)

Pattern Recognition and Machine Learning: Section 3.3
Pattern Recognition and Machine Learning: Section 3.3Pattern Recognition and Machine Learning: Section 3.3
Pattern Recognition and Machine Learning: Section 3.3
 
notes as .ppt
notes as .pptnotes as .ppt
notes as .ppt
 
翻訳精度の最大化による同時音声翻訳のための文分割法 (NLP2014)
翻訳精度の最大化による同時音声翻訳のための文分割法 (NLP2014)翻訳精度の最大化による同時音声翻訳のための文分割法 (NLP2014)
翻訳精度の最大化による同時音声翻訳のための文分割法 (NLP2014)
 
Test
TestTest
Test
 
ACL Reading @NAIST: Fast and Robust Neural Network Joint Model for Statistica...
ACL Reading @NAIST: Fast and Robust Neural Network Joint Model for Statistica...ACL Reading @NAIST: Fast and Robust Neural Network Joint Model for Statistica...
ACL Reading @NAIST: Fast and Robust Neural Network Joint Model for Statistica...
 
複数の事前並べ替え候補を用いた句に基づく統計的機械翻訳
複数の事前並べ替え候補を用いた句に基づく統計的機械翻訳複数の事前並べ替え候補を用いた句に基づく統計的機械翻訳
複数の事前並べ替え候補を用いた句に基づく統計的機械翻訳
 
Part 2: Unsupervised Learning Machine Learning Techniques
Part 2: Unsupervised Learning Machine Learning Techniques Part 2: Unsupervised Learning Machine Learning Techniques
Part 2: Unsupervised Learning Machine Learning Techniques
 
ChainerによるRNN翻訳モデルの実装+@
ChainerによるRNN翻訳モデルの実装+@ChainerによるRNN翻訳モデルの実装+@
ChainerによるRNN翻訳モデルの実装+@
 
Encoder-decoder 翻訳 (TISハンズオン資料)
Encoder-decoder 翻訳 (TISハンズオン資料)Encoder-decoder 翻訳 (TISハンズオン資料)
Encoder-decoder 翻訳 (TISハンズオン資料)
 
最近のDeep Learning (NLP) 界隈におけるAttention事情
最近のDeep Learning (NLP) 界隈におけるAttention事情最近のDeep Learning (NLP) 界隈におけるAttention事情
最近のDeep Learning (NLP) 界隈におけるAttention事情
 
Deep learning を用いた画像から説明文の自動生成に関する研究の紹介
Deep learning を用いた画像から説明文の自動生成に関する研究の紹介Deep learning を用いた画像から説明文の自動生成に関する研究の紹介
Deep learning を用いた画像から説明文の自動生成に関する研究の紹介
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
 
Introduction to Big Data/Machine Learning
Introduction to Big Data/Machine LearningIntroduction to Big Data/Machine Learning
Introduction to Big Data/Machine Learning
 

Similar a Tree-based Translation Models (『機械翻訳』§6.2-6.3)

Topic model an introduction
Topic model an introductionTopic model an introduction
Topic model an introductionYueshen Xu
 
Latent Semantic Transliteration using Dirichlet Mixture
Latent Semantic Transliteration using Dirichlet MixtureLatent Semantic Transliteration using Dirichlet Mixture
Latent Semantic Transliteration using Dirichlet MixtureRakuten Group, Inc.
 
SPADE: Evaluation Dataset for Monolingual Phrase Alignment
SPADE: Evaluation Dataset for Monolingual Phrase AlignmentSPADE: Evaluation Dataset for Monolingual Phrase Alignment
SPADE: Evaluation Dataset for Monolingual Phrase AlignmentYuki Arase
 
(Very) Basic graphing with R
(Very) Basic graphing with R(Very) Basic graphing with R
(Very) Basic graphing with RKazuki Yoshida
 
Align, Disambiguate and Walk : A Unified Approach forMeasuring Semantic Simil...
Align, Disambiguate and Walk  : A Unified Approach forMeasuring Semantic Simil...Align, Disambiguate and Walk  : A Unified Approach forMeasuring Semantic Simil...
Align, Disambiguate and Walk : A Unified Approach forMeasuring Semantic Simil...Koji Matsuda
 
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...Fabian Pedregosa
 
DEseq, voom and vst
DEseq, voom and vstDEseq, voom and vst
DEseq, voom and vstQiang Kou
 
Word Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented LanguagesWord Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented Languageshs0041
 
Individual Brain Charting: third-release dataset validation
Individual Brain Charting: third-release dataset validationIndividual Brain Charting: third-release dataset validation
Individual Brain Charting: third-release dataset validationAna Luísa Pinho
 
The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingua...
The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingua...The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingua...
The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingua...Normunds Grūzītis
 
Arabic syntactic parsing
Arabic syntactic parsingArabic syntactic parsing
Arabic syntactic parsingAmena dheif
 
Wellcome Trust Advances Course: NGS Course - Lecture1
Wellcome Trust Advances Course: NGS Course - Lecture1Wellcome Trust Advances Course: NGS Course - Lecture1
Wellcome Trust Advances Course: NGS Course - Lecture1Thomas Keane
 
Efficient BP Algorithms for General Feedforward Neural Networks
Efficient BP Algorithms for General Feedforward Neural NetworksEfficient BP Algorithms for General Feedforward Neural Networks
Efficient BP Algorithms for General Feedforward Neural NetworksFrancisco Zamora-Martinez
 
SP Study1018 Paper Reading
SP Study1018 Paper ReadingSP Study1018 Paper Reading
SP Study1018 Paper ReadingMori Takuma
 
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Te...
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Te...ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Te...
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Te...Tomoki Hayashi
 
Phonons & Phonopy: Pro Tips (2014)
Phonons & Phonopy: Pro Tips (2014)Phonons & Phonopy: Pro Tips (2014)
Phonons & Phonopy: Pro Tips (2014)Jonathan Skelton
 

Similar a Tree-based Translation Models (『機械翻訳』§6.2-6.3) (20)

Topic model an introduction
Topic model an introductionTopic model an introduction
Topic model an introduction
 
Latent Semantic Transliteration using Dirichlet Mixture
Latent Semantic Transliteration using Dirichlet MixtureLatent Semantic Transliteration using Dirichlet Mixture
Latent Semantic Transliteration using Dirichlet Mixture
 
SPADE: Evaluation Dataset for Monolingual Phrase Alignment
SPADE: Evaluation Dataset for Monolingual Phrase AlignmentSPADE: Evaluation Dataset for Monolingual Phrase Alignment
SPADE: Evaluation Dataset for Monolingual Phrase Alignment
 
On the role of quantum mechanical simulation in materials science.
On the role of quantum mechanical simulation in materials science. On the role of quantum mechanical simulation in materials science.
On the role of quantum mechanical simulation in materials science.
 
(Very) Basic graphing with R
(Very) Basic graphing with R(Very) Basic graphing with R
(Very) Basic graphing with R
 
MT Study SCFG
MT Study SCFGMT Study SCFG
MT Study SCFG
 
Align, Disambiguate and Walk : A Unified Approach forMeasuring Semantic Simil...
Align, Disambiguate and Walk  : A Unified Approach forMeasuring Semantic Simil...Align, Disambiguate and Walk  : A Unified Approach forMeasuring Semantic Simil...
Align, Disambiguate and Walk : A Unified Approach forMeasuring Semantic Simil...
 
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Opti...
 
DEseq, voom and vst
DEseq, voom and vstDEseq, voom and vst
DEseq, voom and vst
 
Word Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented LanguagesWord Segmentation and Lexical Normalization for Unsegmented Languages
Word Segmentation and Lexical Normalization for Unsegmented Languages
 
Individual Brain Charting: third-release dataset validation
Individual Brain Charting: third-release dataset validationIndividual Brain Charting: third-release dataset validation
Individual Brain Charting: third-release dataset validation
 
The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingua...
The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingua...The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingua...
The Role of CNL and AMR in Scalable Abstractive Summarization for Multilingua...
 
Arabic syntactic parsing
Arabic syntactic parsingArabic syntactic parsing
Arabic syntactic parsing
 
Wellcome Trust Advances Course: NGS Course - Lecture1
Wellcome Trust Advances Course: NGS Course - Lecture1Wellcome Trust Advances Course: NGS Course - Lecture1
Wellcome Trust Advances Course: NGS Course - Lecture1
 
Efficient BP Algorithms for General Feedforward Neural Networks
Efficient BP Algorithms for General Feedforward Neural NetworksEfficient BP Algorithms for General Feedforward Neural Networks
Efficient BP Algorithms for General Feedforward Neural Networks
 
Thesis biobix
Thesis biobixThesis biobix
Thesis biobix
 
SP Study1018 Paper Reading
SP Study1018 Paper ReadingSP Study1018 Paper Reading
SP Study1018 Paper Reading
 
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Te...
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Te...ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Te...
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Te...
 
Phonons & Phonopy: Pro Tips (2014)
Phonons & Phonopy: Pro Tips (2014)Phonons & Phonopy: Pro Tips (2014)
Phonons & Phonopy: Pro Tips (2014)
 
first_seminar
first_seminarfirst_seminar
first_seminar
 

Último

Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...RKavithamani
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxContemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptx
Contemporary philippine arts from the regions_PPT_Module_12 [Autosaved] (1).pptxRoyAbrique
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 

Último (20)

Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
mini mental status format.docx

Tree-based Translation Models (『機械翻訳』§6.2-6.3)

  • 1. Tree-based Translation Models. Yusuke Oda (@odashi_t), NAIST MT-Study Group, 2014/6/5. Copyright (C) 2014 by Yusuke Oda, AHC-Lab, IS, NAIST.
  • 2. Agenda
      ● (6.2) Synchronous Context Free Grammar (SCFG) [Hiero]
          – (6.2.2) Learning SCFG
          – (6.2.3) Introducing Syntax Labels
          – (6.2.4) Features
          – (6.2.5) Decoding
          – (6.2.6) Rescoring
      ● (6.3) Synchronous Tree Substitution Grammar (STSG) [Travatar]
          – (6.3.1) Characteristics of STSG
          – (6.3.2) Learning STSG
          – (6.3.3) Features
          – (6.3.4) Decoding
          – (6.3.5) Binarization
      ● (6.4) Synchronous Parsing
          – (6.4.1) Inversion Transduction Grammar (ITG)
          – (6.4.2) Span Pruning
          – (6.4.3) Beam Search
          – (6.4.4) Two Parsing
  • 3. Synchronous Context Free Grammar (SCFG)
  • 4. Learning SCFG
      ● Synchronous rules are extracted from each sentence pair of a parallel corpus together with its word alignment.
      ● Inputs for each pair: the source sentence, the target sentence, and the set of word alignments between them.
  • 5. Closed Phrase Pair under Word Alignment
      ● A phrase pair is closed under its word alignment when the phrase pair and the alignment satisfy the closure condition.
      ● Example sentence pair: "he will dissolve the diet in the near future" / "彼 は 近い うち に 国会 を 解散 する"
      ● Closed phrase pair in this example: (国会 を → the diet)
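The closure condition itself was a formula on the slide and is not preserved in this transcript. The sketch below assumes the usual phrase-extraction reading: no alignment link may connect a word inside the phrase pair to a word outside it, and at least one link must fall inside. The alignment links and the name `is_closed` are hypothetical and were chosen only so that the slide's example pair comes out closed.

```python
# Slide example sentences (0-based word indices).
src = "彼 は 近い うち に 国会 を 解散 する".split()
tgt = "he will dissolve the diet in the near future".split()

# ASSUMPTION: these (source index, target index) links are illustrative;
# the real alignment from the slide's figure is not in the transcript.
alignment = {(0, 0), (2, 7), (3, 8), (4, 5), (5, 3), (5, 4), (7, 2), (8, 2)}


def is_closed(alignment, src_span, tgt_span):
    """Return True if the phrase pair is closed under the alignment.

    Spans are inclusive (start, end) index pairs. The pair is closed when
    every link is either fully inside or fully outside the pair, and at
    least one link lies inside it.
    """
    has_inner_link = False
    for s, t in alignment:
        in_src = src_span[0] <= s <= src_span[1]
        in_tgt = tgt_span[0] <= t <= tgt_span[1]
        if in_src != in_tgt:  # the link crosses the phrase-pair boundary
            return False
        if in_src:
            has_inner_link = True
    return has_inner_link


# (国会 を, the diet): source words 5..6, target words 3..4
print(is_closed(alignment, (5, 6), (3, 4)))
# (国会 を, diet) drops "the", so the link 国会-the crosses the boundary
print(is_closed(alignment, (5, 6), (4, 4)))
```

Under this assumed alignment, the first call reports a closed pair and the second does not.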
  • 6. Extracting Abstract Rules
      ● We can make more abstract synchronous rules by replacing some words in a phrase pair with a non-terminal symbol, when the phrase pair contains another, smaller phrase pair.
      ● Phrase pair: (近い うち に 国会 を 解散 する, dissolve the diet in the near future)
      ● Smaller phrase pairs it contains: (国会 を, the diet) and (近い うち, near future)
      ● Remaining words: (近い うち ... 解散 する, dissolve the ... near future)
      ● Abstract rule: (X1 に X2 解散 する, dissolve X2 in the X1)
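The substitution on this slide can be sketched in a few lines. This is a minimal illustration, not the extraction code of any toolkit: the helper names `replace_once` and `abstract_rule` are made up here, and sub-phrases are located by token matching, whereas a real extractor would track aligned spans.

```python
def replace_once(tokens, sub, symbol):
    """Replace the first occurrence of the token list `sub` with `symbol`."""
    n = len(sub)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == sub:
            return tokens[:i] + [symbol] + tokens[i + n:]
    raise ValueError(f"sub-phrase {sub} not found in {tokens}")


def abstract_rule(src, tgt, sub_pairs):
    """Replace each smaller phrase pair with a co-indexed non-terminal X1, X2, ..."""
    for k, (sub_src, sub_tgt) in enumerate(sub_pairs, start=1):
        symbol = f"X{k}"
        src = replace_once(src, sub_src, symbol)
        tgt = replace_once(tgt, sub_tgt, symbol)
    return src, tgt


src = "近い うち に 国会 を 解散 する".split()
tgt = "dissolve the diet in the near future".split()
sub_pairs = [
    ("近い うち".split(), "near future".split()),  # becomes X1
    ("国会 を".split(), "the diet".split()),        # becomes X2
]

new_src, new_tgt = abstract_rule(src, tgt, sub_pairs)
print(" ".join(new_src))  # X1 に X2 解散 する
print(" ".join(new_tgt))  # dissolve X2 in the X1
```

The shared index on each non-terminal is what keeps the two sides synchronized even though X1 and X2 appear in opposite orders in the two languages.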
  • 7. Hiero Grammar
      ● Hierarchical phrase grammar (Hiero grammar): the set of all synchronous rules in the parallel corpus.
      ● Algorithm:
          1. Start with the rules formed from the set of all possible phrase pairs in the parallel corpus.
          2. If a rule contains a phrase pair on both its source and target sides, then
          3. add the rule obtained by replacing that phrase pair with a non-terminal symbol.
  • 8. Constraints of Hiero Rules
      ● To limit the size and ambiguity of a Hiero grammar, we can introduce constraints on rule extraction.
      ● Minimal phrase pair
          – (国会 を, the diet) ... BAD
          – (国会, the diet) ... GOOD
      ● Phrase length
          – (奈良 先端 科学 技術 大学院 大学 情報 科学 研究 科 自然 言語 処理 学 研究 室, ...) ... BAD (too many words)
      ● Number of symbols
          – X → 〈あらゆる X1 を 全て X2 の 方 へ ねじ曲げ た の だ, ...〉 ... BAD (too many symbols)
      ● Rank of rule
          – X → 〈X1 が X2 で X3 に X4 した, ...〉 ... BAD (too many non-terminals)
  • 9. Glue Rules
      ● To build long sentences from small rules, we introduce glue rules, which in the standard Hiero formulation are S → 〈S1 X2, S1 X2〉 and S → 〈X1, X1〉.
  • 10. Introducing Syntax Labels
      ● Up to here, we have considered the basic ideas of Hiero rules, where the only non-terminal symbols are X and S.
      ● This model is very simple, but very ambiguous.
      ● Next, we introduce syntax information into Hiero rules (= syntax-augmented machine translation, SAMT).
      [Figure: parse tree of "this is a pen" with labels S, NP, VP, PRP, VBZ, DT, NN; Hiero + Syntax → SAMT]
  • 11. Combinatory Categorial Grammar (CCG)
      ● SAMT uses categories (≒ partial structures of syntax labels) based on the idea of combinatory categorial grammar (CCG).
      ● Categories:
          – A/B: syntax label A lacking its right-side child B
          – A\B: syntax label A lacking its left-side child B
          – A+B: concatenation of two syntax labels A and B
  • 12. Extracting SAMT Rules
      [Figure: "dissolve the diet in the near future" with its parse tree (VP, VB, NP, PP, DT, NNP, IN, JJ, NN) aligned to "近い うち に 国会 を 解散 する", with spans labeled NP, PP, VP, NP\DT, IN+DT, VP/PP, VP\VB]
      ● VP → 〈NP\DT1 に NP2 解散 する, dissolve NP2 in the NP\DT1〉
      ● VP → 〈近い うち IN+DT1 国会 を VB2, VB2 the diet IN+DT1 near future〉
      ● etc.
  • 13. Probabilistic Formalization of Hiero Model
      ● We treat translation with a Hiero grammar as maximization of the posterior probability (as in the phrase-based model).
      ● We assume this probability follows a log-linear model defined over a derivation (≒ the set of synchronous rules used), a weight vector, and feature functions.
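The slide's formulas are not preserved, so the following is only a generic sketch of log-linear scoring: each candidate derivation has a feature vector, its score is the weighted sum of the features, and the decoder keeps the candidate with the highest score. The feature names, weights, and candidate values are invented for illustration.

```python
import math


def loglinear_score(features, weights):
    """Weighted sum of feature values (the exponent of the log-linear model)."""
    return sum(weights[name] * value for name, value in features.items())


def posterior(candidates, weights):
    """Normalize exp(score) over the candidate set (softmax)."""
    scores = [loglinear_score(c["features"], weights) for c in candidates]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def best_candidate(candidates, weights):
    """Normalization does not change the argmax, so raw scores suffice."""
    return max(candidates, key=lambda c: loglinear_score(c["features"], weights))


# ASSUMPTION: hypothetical weights and feature values.
weights = {"log_p_fwd": 1.0, "log_p_bwd": 0.5, "log_lm": 1.0, "length": -0.1}
candidates = [
    {"output": "he will dissolve the diet in the near future",
     "features": {"log_p_fwd": -2.0, "log_p_bwd": -2.5, "log_lm": -8.0, "length": 9}},
    {"output": "he dissolve diet near future",
     "features": {"log_p_fwd": -1.5, "log_p_bwd": -3.0, "log_lm": -12.0, "length": 5}},
]

print(best_candidate(candidates, weights)["output"])
print(posterior(candidates, weights))
```

Because the normalizer is the same for every candidate, a decoder only needs the unnormalized score when searching for the best output.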
  • 14. Features of Hiero Model (1)
      ● Generative model: likelihoods of the translation probability, given as a forward model and a backward model.
  • 15. Features of Hiero Model (2)
      ● Generative model: likelihoods of the translation probability, given as a syntax model (f) and a syntax model (e).
  • 16. Features of Hiero Model (3)
      ● Lexical translation model: goodness of the phrase alignment, given as a forward model and a backward model.
  • 17. Features of Hiero Model (4)
      ● Language model: measures the fluency of the hypothesis.
      ● Out-of-vocabulary (OOV) penalty: adjusts the language model score.
      ● Length penalty: adjusts the number of words in the hypothesis.
      ● Glue penalty: adjusts the number of glue rules in the derivation.
  • 18. Decoding of Hiero Model
      ● Given an input sentence and a set of SCFG rules, we find the optimal output sequence.
      ● The search ranges over the set of possible derivations under the grammar; the output is the sequence of terminal symbols in the chosen derivation.
  • 19. Decoding Process
      1. Compute the intersection of the input sentence and the grammar (= generate a syntax forest using the CYK algorithm).
      2. Transform the syntax forest into the corresponding translation forest.
      3. Output the sequence of terminal symbols in the translation forest that maximizes the model score.
      [Figure: source tree for "犬 が 本 の 上 に 座った" and target tree for "the dog sat on the book"]
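Step 1 can be illustrated with a small CYK chart parser. This is a monolingual sketch only: it fills a chart over the source sentence and does not build the translation forest of steps 2 and 3. The toy grammar (labels N, P, V, PP, VP, S and their rules) is an assumption made for this example and is not the grammar drawn on the slide.

```python
def cyk(tokens, lexical, binary):
    """Fill a CYK chart: chart[(i, j)] is the set of labels spanning tokens[i:j]."""
    n = len(tokens)
    chart = {}
    for i, tok in enumerate(tokens):
        chart[(i, i + 1)] = set(lexical.get(tok, ()))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = set()
            for k in range(i + 1, j):  # split point between the two children
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        cell |= binary.get((left, right), set())
            chart[(i, j)] = cell
    return chart


# ASSUMPTION: hypothetical toy grammar in Chomsky normal form.
lexical = {
    "犬": {"N"}, "本": {"N"}, "上": {"N"},
    "が": {"P"}, "の": {"P"}, "に": {"P"},
    "座った": {"V"},
}
binary = {
    ("N", "P"): {"PP"},   # noun + particle
    ("PP", "N"): {"N"},   # modifier phrase + noun, e.g. 本 の 上
    ("PP", "V"): {"VP"},
    ("PP", "VP"): {"S"},
}

tokens = "犬 が 本 の 上 に 座った".split()
chart = cyk(tokens, lexical, binary)
print("S" in chart[(0, len(tokens))])
```

Each chart cell collects every label that can cover its span, so the filled chart is a compact encoding of all parses; keeping back-pointers for each label would turn it into the syntax forest the slide refers to.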
  • 20. Synchronous Tree Substitution Grammar (STSG)
  • 21. Synchronous Tree Substitution Grammar
      ● STSG is an extension of Tree Substitution Grammar (TSG) for bilingual analysis.
      ● STSG is a subset of Synchronous Tree Adjoining Grammar (STAG): SCFG (Hiero) ⊂ STSG ⊂ STAG.
      ● Definition: a tuple consisting of the set of non-terminal symbols, the start symbol, the set of terminal symbols, the set of rules, and the weight semiring.
  • 22. Synchronous Rules of STSG
      ● Definition: a rule consists of an elementary tree in the source language, an elementary tree in the target language, and the association between the two.
      ● Each rule is also associated with a weight.
      [Figure: source elementary tree S(x1:NP VP(x2:NP V(開けた))) paired with target elementary tree S(x1:NP VP(VBD(opened) x2:NP)); the variable leaves are marked as the frontier]
  • 23. Expressive Power of STSG
      ● SCFG cannot express differences in syntactic structure between the two languages, but STSG can.
      ● Example:
          – This synchronous rule cannot be generated by composing smaller SCFG rules, because the two trees have no corresponding internal structure.
          – The STSG framework can handle this kind of tree-structure correspondence directly.
      [Figure: source tree over 犬 が x1:CD 匹 (labels NP, PP, N, P, PC) paired with target tree NP(x1:CD NNS(dogs))]
  • 24. Translation Models under STSG Framework
      ● In the STSG framework, we can use the sequence of frontier nodes (the leaves of a synchronous rule) instead of the full tree.
      ● Four translation models are available, depending on whether a tree or a sequence of frontier nodes is used as the data structure for the source and target language:
          – Source: frontier, target: frontier → string-to-string translation (= SCFG)
          – Source: frontier, target: tree → string-to-tree translation
          – Source: tree, target: frontier → tree-to-string translation
          – Source: tree, target: tree → tree-to-tree translation
      [Figure: the tree S(x1:NP VP(x2:NP V(開けた))) and its sequence of frontier nodes]
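Reading off the sequence of frontier nodes is a left-to-right traversal of an elementary tree's leaves. The sketch below uses the elementary trees from slide 22; the nested `(label, children)` tuple encoding and the function name `frontier` are choices made for this illustration.

```python
def frontier(node):
    """Return the leaves of an elementary tree, left to right."""
    label, children = node
    if not children:  # a leaf: either a variable such as x1:NP or a word
        return [label]
    leaves = []
    for child in children:
        leaves.extend(frontier(child))
    return leaves


# Source-side elementary tree: S(x1:NP VP(x2:NP V(開けた)))
source_tree = ("S", [
    ("x1:NP", []),
    ("VP", [("x2:NP", []), ("V", [("開けた", [])])]),
])

# Target-side elementary tree: S(x1:NP VP(VBD(opened) x2:NP))
target_tree = ("S", [
    ("x1:NP", []),
    ("VP", [("VBD", [("opened", [])]), ("x2:NP", [])]),
])

print(frontier(source_tree))  # ['x1:NP', 'x2:NP', '開けた']
print(frontier(target_tree))  # ['x1:NP', 'opened', 'x2:NP']
```

Using `frontier(...)` on one side and the full tree on the other gives the string-to-tree and tree-to-string variants; using it on both sides reduces the rule to an SCFG-style string pair.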
  • 25. Retrieving STSG Synchronous Rules
      ● Heuristic method (similar to SCFG rule extraction), applied to the syntax tree generated from the source sentence and the syntax tree generated from the target sentence.
      [Figure: parse trees of "dissolve the diet in the near future" and "近い うち に 国会 を 解散 する" with their alignment, and an extracted rule pairing a source VP over x1:PP x2:NP 解散 する with the target VP(VB(dissolve) x2:NP x1:PP)]
  • 26. GHKM Algorithm
      ● Galley-Hopkins-Knight-Marcu (GHKM) algorithm
          – Generates STSG synchronous rules (string-to-tree rules) by composing minimal rules using the inside-outside algorithm.
      1. Detect minimal rules from the target syntax trees.
      2. Generate larger synchronous rules by composing minimal rules.
      [Figure: a syntax tree and a minimal rule extracted from it]
  • 27. GHKM: Alignment Span (1)
      ● Alignment span: the set of indexes of source-sentence words aligned to a partial tree.
      ● Complement alignment span: the set of indexes of source-sentence words aligned to anything other than that partial tree.
      ● Closure: the minimum range that covers the alignment span.
  • 28. GHKM: Alignment Span (2)
      [Figure: alignment spans on the parse tree of "he will dissolve the diet in the near future" (labels S, NP, PRP, MD, VP, VB, PP, DT, NNP, IN, JJ, NN), aligned to "彼 は 近い うち に 国会 を 解散 する"]
  • 29. GHKM: Admissible Node
      ● Admissible node: a node in the target syntax tree whose alignment span satisfies the admissibility condition.
      [Figure: admissible nodes marked on the parse tree of "he will dissolve the diet in the near future", aligned to "彼 は 近い うち に 国会 を 解散 する"]
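The span definitions from slide 27 and the admissibility test can be sketched together. The condition on this slide was a formula that the transcript does not preserve; the code assumes the standard GHKM criterion, namely that the closure of a node's alignment span must not overlap its complement alignment span. Treating nodes with an empty span as non-admissible is a further simplification made here. The alignment links are hypothetical, and the tree is written from the labels visible in the figure residue.

```python
# Target sentence: he(0) will(1) dissolve(2) the(3) diet(4) in(5) the(6) near(7) future(8)
# Source sentence: 彼(0) は(1) 近い(2) うち(3) に(4) 国会(5) を(6) 解散(7) する(8)
# ASSUMPTION: illustrative (source index, target index) links.
alignment = {(0, 0), (2, 7), (3, 8), (4, 5), (5, 3), (5, 4), (7, 2), (8, 2)}

# Nodes are (label, children); a child is either a node or a target word index.
dt_the = ("DT", [3])
np_diet = ("NP", [dt_the, ("NNP", [4])])
np_future = ("NP", [("DT", [6]), ("JJ", [7]), ("NN", [8])])
pp = ("PP", [("IN", [5]), np_future])
vp = ("VP", [("VB", [2]), np_diet, pp])
root = ("S", [("NP", [("PRP", [0])]), ("MD", [1]), vp])


def leaves(node):
    """Target word indexes covered by a node."""
    if isinstance(node, int):
        return {node}
    covered = set()
    for child in node[1]:
        covered |= leaves(child)
    return covered


def alignment_span(node, alignment):
    """Source indexes aligned to words inside the partial tree."""
    inside = leaves(node)
    return {s for s, t in alignment if t in inside}


def complement_span(node, alignment):
    """Source indexes aligned to words outside the partial tree."""
    inside = leaves(node)
    return {s for s, t in alignment if t not in inside}


def closure(span):
    """Smallest contiguous range covering the span."""
    return set(range(min(span), max(span) + 1)) if span else set()


def is_admissible(node, alignment):
    span = alignment_span(node, alignment)
    return bool(span) and not (closure(span) & complement_span(node, alignment))


print(is_admissible(np_future, alignment))  # "the near future"
print(is_admissible(dt_the, alignment))     # "the" inside "the diet"
```

Under the assumed links, 国会 is aligned to both "the" and "diet", so the DT node alone shares a source word with material outside its subtree and fails the test, while the NP covering "the diet" passes.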
  • 30. GHKM: Minimal Rule
      ● Split the syntax tree at the admissible nodes.
      [Figure: the aligned parse tree of "he will dissolve the diet in the near future" / "彼 は 近い うち に 国会 を 解散 する", with example minimal rules: a VP rule over x1:PP x2:NP x3:VB whose variables appear in the order x3 x2 x1 on the other side, and a rule pairing 近い うち with "the near future" (DT JJ NN)]
  • 31. Extension for Tree-to-tree Model (1)
      ● We need to extract pairs of nodes from the two syntax trees that are admissible with respect to each other.
      ● First, find the admissible nodes in the given trees.
      ● A node pair that satisfies the condition on their spans is bidirectionally admissible.
      ● Span: the minimum range over the sentence that covers all terminal symbols under a node.
  • 32. Extension for Tree-to-tree Model (2)
      [Figure: parse trees of "dissolve the diet in the near future" (VP, VB, NP, PP, DT, NNP, IN, JJ, NN) and "近い うち に 国会 を 解散 する" (VP, PP, NP, N, P) with aligned node pairs]
  • 33. Features of STSG Model (1)
      ● Generative model: likelihoods of the translation probability.
  • 34. Features of STSG Model (2)
      ● Lexical translation model: goodness of the phrase alignment.
  • 35. Features of STSG Model (3)
      ● Height penalty: adjusts the depth of the derivation.
      ● Internal node penalty: adjusts the total size of the derivation.
      ● Some of the features introduced for the Hiero model are also available.
  • 36. Decoding of STSG Model
      ● STSG decoding basically uses the same method as Hiero decoding; part of the formulation depends on the translation model.
  • 37. Difference of Formalization of Each Model
      ● Non-syntax-based translation:
          – String-to-string model: the same model as the Hiero (SCFG) model.
          – String-to-tree model: uses no information from the syntax of the source sentence.
      ● Syntax (tree)-based translation:
          – Tree-to-string model and tree-to-tree model: explicitly use syntax information of the source sentence.
          – The translation process can be divided into syntax analysis and decoding.
      [Figure: source sentence → syntax analyzer → syntax tree of source sentence → decoder → translation hypothesis(es)]
  • 38. Formalization of Syntax-based Translation
      ● A syntax-based translation model uses the syntax tree of the source sentence.
      ● We can ignore the term for the source syntax tree, because the tree is already fixed during syntax analysis.
  • 39. Questions & Discussions