12. RBMT
RBMT makes use of human-encoded
linguistic rules for translation.
Developing an RBMT system is
very expensive because it requires
a great deal of human labour and takes a
long time (years).
Winter School 2013, Birmingham
13. RBMT
RBMT systems can reach good translation
quality after years of development in a
given domain.
Well-developed RBMT systems tend to
capture large-scale sentence structures
better, but handle short expressions worse,
than SMT systems.
14. EBMT
An EBMT system translates sentences by
analogy with existing translation examples.
EBMT does not need deep analysis of the
source text and may generate high-quality
translations when similar examples are
found.
16. EBMT
The quality of EBMT increases as more
examples become available.
A problem of EBMT is the coverage of
the examples, especially for long
sentences.
17. TM
Translation Memory directly outputs an
existing target sentence when a very
similar source sentence is found in the
memory; otherwise it outputs nothing.
18. SMT
SMT builds statistical models to predict the
probability of a target sentence being the
translation of a given source sentence.
Translating a given source sentence amounts
to searching for the target sentence with the
highest translation probability.
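This search is commonly written as the noisy-channel decision rule (a standard SMT formulation, added here for reference; e is the target sentence, f the source):

```latex
\hat{e} \;=\; \operatorname*{arg\,max}_{e} P(e \mid f)
        \;=\; \operatorname*{arg\,max}_{e} P(f \mid e)\, P(e)
```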
19. SMT
A large number of translation pairs (a parallel
corpus) is needed to estimate the model
parameters.
To predict the translation, sentence pairs are
broken into smaller translation equivalences
at the word, phrase, or syntax-rule level.
23. Phrase-based SMT
Source                          Target              Probability
Bushi (布什)                    Bush                0.5
                                president Bush      0.3
                                the US president    0.2
Bushi yu (布什与)               Bush and            0.8
                                the president and   0.2
yu Shalong (与沙龙)             and Sharon          0.6
                                with Sharon         0.4
juxing le huitan (举行了会谈)   hold a meeting      0.7
                                had a meeting       0.3
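The phrase table above can be put to work in a tiny scoring sketch: under a pure translation-model view (ignoring the language model and reordering), the score of a translation is the product of the probabilities of the phrase pairs it uses. The table contents mirror the slide; the `score` function is illustrative, not from the slides.

```python
# Phrase table from the slide (source phrase -> {target phrase: probability}).
phrase_table = {
    "Bushi yu": {"Bush and": 0.8, "the president and": 0.2},
    "yu Shalong": {"and Sharon": 0.6, "with Sharon": 0.4},
    "juxing le huitan": {"hold a meeting": 0.7, "had a meeting": 0.3},
}

def score(segmentation):
    """segmentation: list of (source_phrase, target_phrase) pairs.
    Returns the product of the phrase-pair probabilities."""
    p = 1.0
    for src, tgt in segmentation:
        p *= phrase_table[src][tgt]
    return p

# "Bushi yu / juxing le huitan" -> "Bush and ... hold a meeting"
seg = [("Bushi yu", "Bush and"), ("juxing le huitan", "hold a meeting")]
print(score(seg))  # 0.8 * 0.7 ≈ 0.56
```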
25. Hierarchical Phrase-based SMT
Source                          Target          Probability
juxing le huitan (举行了会谈)   hold a meeting  0.6
                                had a meeting   0.3
X huitan (X会谈)                X a meeting     0.8
                                X a talk        0.2
juxing le X (举行了X)           hold a X        0.5
                                had a X         0.5
Bushi yu Shalong (布什与沙龙)   Bush and Sharon 0.8
Bushi X (布什X)                 Bush X          0.7
X yu Y (X与Y)                   X and Y         0.9
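Hierarchical rules such as `X yu Y → X and Y` apply recursively, with the gap variables filled by sub-translations. A minimal sketch, hard-coding just that gap rule plus a toy lexicon (both illustrative, not from the slides):

```python
lexicon = {"Bushi": "Bush", "Shalong": "Sharon"}

def translate(tokens):
    # Gap rule: X yu Y -> X and Y (p = 0.9 in the table above),
    # recursing on the two sub-spans.
    if "yu" in tokens:
        i = tokens.index("yu")
        return translate(tokens[:i]) + ["and"] + translate(tokens[i + 1:])
    # Base case: word-by-word lexical translation.
    return [lexicon.get(t, t) for t in tokens]

print(" ".join(translate(["Bushi", "yu", "Shalong"])))  # Bush and Sharon
```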
27. Syntax-based SMT
Source                                  Target          Probability
VPB(VS(juxing) AS(le) NPB(huitan))      hold a meeting  0.6
(举行了会谈)                            have a meeting  0.3
                                        have a talk     0.1
VPB(VS(juxing) AS(le) x1:NPB)           hold a x1       0.5
(举行了x1)                              have a x1       0.5
VP(PP(P(yu) x1:NPB) x2:VPB) (与 x1 x2)  x2 with x1      0.9
IP(x1:NPB VP(x2:PP x3:VPB))             x1 x3 x2        0.7
28. SMT
SMT is cheap.
SMT systems can be developed in a
short time.
SMT needs a large parallel corpus.
29. SMT
SMT produces good-quality translations if we
have plenty of in-domain data.
SMT quality drops dramatically on out-of-domain data.
SMT output is fluent in short phrases but
not good at large-scale sentence structures
(especially for distant language pairs).
30. Why Hybrid MT?
Each MT approach has its pros and cons.
We want to take advantage of different MT
approaches.
We do not want to waste our investment
in existing MT systems.
31. Outline
Why Hybrid MT?
An overview of Hybrid MT
Typical Hybrid MT Approaches
Conclusion
32. An overview of Hybrid MT
Selective MT: loose coupling
Pipelined MT: medium coupling
Mixture MT: close coupling
33. Selective MT
Given translations generated by
different approaches, Selective MT
tries to select the best one, or to select
the best parts of different translations
and combine them into a new one.
36. Selective MT
Typical Selective MT:
System Recommendation
System Combination
Sentence-level combination
Word-level combination
37. Pipelined MT
Pipelined MT adopts one approach as
the main approach and uses another
approach for monolingual pre-processing or post-processing.
39. Pipelined MT
Typical Pipelined MT:
Statistical Post-Editing for RBMT
Rule-based Pre-reordering for SMT
40. Mixture MT
Mixture MT adopts one approach as
the main approach but utilizes one or
more other approaches in some
components.
45. System Recommendation
Yifan He, Yanjun Ma, Josef van Genabith and Andy
Way, Bridging SMT and TM with System
Recommendation, Proceedings of the 48th Annual
Meeting of the Association for Computational
Linguistics (ACL2010), pages 622–630, Uppsala,
Sweden, 11-16 July 2010.
46. System Recommendation
Intuition:
When the translation memory is big enough,
the trained SMT system is comparable with
the TM output in translation quality.
This raises the problem of selection.
System recommendation recommends SMT
output to a TM user when it predicts that the
SMT output is more suitable for post-editing
than the hits provided by the TM.
48. System Recommendation
An SVM binary classifier is adopted.
The classifier is trained on human-annotated data.
A confidence score is given for the
recommendation.
49. System Recommendation
SMT System Features: features used in the SMT system
TM Feature: Fuzzy Match Cost
System Independent Features:
Source-Side Language Model Score and Perplexity
Target-Side Language Model Perplexity
The Pseudo-Source Fuzzy Match Score
The IBM Model 1 Score.
50. System Recommendation
Evaluation metrics are defined over two sets:
A, the set of recommended MT outputs, and
B, the set of MT outputs that have lower TER
than the corresponding TM hits.
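The formula itself did not survive extraction from the slide; given the definitions of A and B, the metrics are presumably the standard precision and recall of the recommendation:

```latex
\mathrm{Precision} = \frac{|A \cap B|}{|A|}, \qquad
\mathrm{Recall} = \frac{|A \cap B|}{|B|}
```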
54. System Combination
Rosti, A. V. I., Ayan, N. F., Xiang, B., Matsoukas,
S., Schwartz, R. M., & Dorr, B. J. (2007, April).
Combining Outputs from Multiple Machine
Translation Systems. In HLT-NAACL (pp. 228-235).
55. System Combination
Rosti, A. V. I., Matsoukas, S., & Schwartz, R. (2007,
June). Improved word-level system combination for
machine translation. In Annual Meeting of the
Association for Computational Linguistics
(Vol. 45, No. 1, p. 312).
56. System Combination
He, X., Yang, M., Gao, J., Nguyen, P., & Moore, R.
2008. Indirect-HMM-based hypothesis alignment for
combining outputs from machine translation systems.
In Proceedings of the Conference on Empirical
Methods in Natural Language Processing (pp. 98-107).
Association for Computational Linguistics.
57. System Combination
Feng, Y., Liu, Y., Mi, H., Liu, Q., & Lü, Y. 2009. Lattice-based
system combination for statistical machine
translation. In Proceedings of the 2009 Conference on
Empirical Methods in Natural Language Processing:
Volume 3-Volume 3 (pp. 1105-1113). Association for
Computational Linguistics.
58. Sentence-Level
System Combination
Kumar, S., & Byrne, W. J. (2004, May).
Minimum Bayes-Risk Decoding for
Statistical Machine Translation. In
HLT-NAACL (pp. 169-176).
59. Sentence-Level
System Combination
Suppose we have several MT systems.
For a given source text F, each MT system
outputs an n-best list of target texts.
If possible, each MT system gives each target
text a probability P(E|F); otherwise we may
treat the n-best target texts as equally
probable.
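Sentence-level combination in the Minimum Bayes Risk spirit can be sketched as follows: choose the candidate with the highest expected similarity (equivalently, the lowest expected loss) against all candidates, weighted by their probabilities. The similarity here is a crude unigram overlap standing in for BLEU, and the candidates and probabilities are illustrative.

```python
def overlap(a, b):
    """Fraction of a's tokens that also occur in b (crude similarity)."""
    ta, tb = a.split(), b.split()
    return sum(w in tb for w in ta) / len(ta)

def mbr_select(candidates):
    """candidates: list of (sentence, probability) pooled from all systems.
    Maximizing expected similarity == minimizing expected loss (1 - similarity)."""
    def expected_gain(e):
        return sum(p * overlap(e, e2) for e2, p in candidates)
    return max(candidates, key=lambda c: expected_gain(c[0]))[0]

hyps = [
    ("Bush held a meeting with Sharon", 0.4),
    ("Bush hold a meeting with Sharon", 0.35),
    ("Bush met Sharon", 0.25),
]
print(mbr_select(hyps))  # Bush held a meeting with Sharon
```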
61. Word-Level
System Combination
Select a translation candidate as a skeleton
(backbone) with Minimum Bayes Risk.
Construct a confusion network by aligning
all the words in the other translation candidates
to the words in the skeleton.
Select the best path through the confusion
network to generate a new translation.
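The steps above can be sketched minimally: align each hypothesis to the skeleton position-by-position (real systems use TER- or HMM-based alignment), then pick the majority word in each column of the resulting confusion network. The hypotheses are illustrative.

```python
from collections import Counter

def combine(skeleton, others):
    """Naive positional confusion network: one column per skeleton word,
    decoded by majority vote over all hypotheses."""
    hyps = [skeleton] + others
    result = []
    for i, _ in enumerate(skeleton):
        votes = [h[i] for h in hyps if i < len(h)]
        result.append(Counter(votes).most_common(1)[0][0])
    return result

skeleton = ["Bush", "hold", "a", "meeting", "with", "Sharon"]
others = [
    ["Bush", "held", "a", "meeting", "with", "Sharon"],
    ["Bush", "held", "a", "talk", "with", "Sharon"],
]
print(" ".join(combine(skeleton, others)))  # Bush held a meeting with Sharon
```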
65. Word-Level
System Combination
System combination has proved to be very
effective.
In the NIST Open MT Evaluation Chinese-English task, MSR-NRC-SRI ranked first
by using system combination technologies.
In later NIST evaluations, separate tracks
were defined for participants using and not
using system combination technologies.
66. Typical Hybrid MT Approaches
Selective MT
Pipelined MT
Statistical Post-Editing for RBMT
Rule-based Pre-reordering for SMT
Mixture MT
67. Statistical Post-Editing for RBMT
Dugast, L., Senellart, J., & Koehn, P. (2007, June).
Statistical post-editing on SYSTRAN's rule-based
translation system. In Proceedings of the Second
Workshop on Statistical Machine Translation (pp.
220-223). Association for Computational
Linguistics.
68. Statistical Post-Editing for RBMT
Simard, M., Ueffing, N., Isabelle, P., & Kuhn, R.
(2007). Rule-based Translation With Statistical
Phrase-based Post-editing. Second Workshop on
Statistical Machine Translation. Prague, Czech
Republic. June 23, 2007. pp. 203–206.
69. Statistical Post-Editing
When we have:
A very good RBMT system
A large parallel corpus that can be
used for SMT training
Both RBMT and SMT have advantages and
disadvantages.
Can we benefit from both methods?
70. Statistical Post-Editing
A Statistical Post-Editing (SPE) system is a
monolingual SMT system that takes the result of an
RBMT system as input and generates an improved
target output.

Source Text → RBMT → RBMT Result → SPE → SPE Result
71. Statistical Post Edit: Training
Source → RBMT → RBMT Target
(RBMT Target, Target) pairs → SPE Training
72. Statistical Post Edit: Training
RBMT usually generates better word
order, while SMT makes better
lexical selections.
RBMT+SPE outperforms both the original
RBMT and SMT systems.
73. Typical Hybrid MT Approaches
Selective MT
Pipelined MT
Statistical Post-Editing for RBMT
Rule-based Pre-reordering for SMT
Mixture MT
74. Rule-based Pre-reordering for SMT
Elia Yuste, Manuel Herranz, Alexandra Helle and
Hirokazu Suzuki, Go Hybrid: Pangeanic's and Toshiba's
First Steps Towards ENJP MT Hybridization, AAMT
Journal, No.50, December 2011 (Part B for this tutorial)
75. Rule-based Pre-reordering for SMT
Xia, F., & McCord, M. (2004, August). Improving a
statistical MT system with automatically learned rewrite
patterns. In Proceedings of the 20th international
conference on Computational Linguistics (p. 508).
Association for Computational Linguistics.
76. Rule-based Pre-reordering for SMT
A phrase-based SMT (PBSMT) system
makes good lexical choices but is not
good at long-distance reordering without
linguistic knowledge.
A rule-based word reordering on the source
side is conducted to make the word order of
the source text much closer to the
word order of the target side.
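A minimal sketch of such a reordering rule for the running Chinese→English example: a pre-verbal PP ("yu X" = "with X") is moved after the verb phrase, echoing the IP rule `x1 x2 x3 → x1 x3 x2` in the syntax table, so the source order matches English order before PBSMT runs. The rule and chunking here are illustrative.

```python
def pre_reorder(chunks):
    """chunks: list of (label, tokens). Swap a PP that immediately
    precedes a VP, then flatten back to a token sequence."""
    out = list(chunks)
    for i in range(len(out) - 1):
        if out[i][0] == "PP" and out[i + 1][0] == "VP":
            out[i], out[i + 1] = out[i + 1], out[i]
    return [tok for _, toks in out for tok in toks]

sentence = [
    ("NP", ["Bushi"]),
    ("PP", ["yu", "Shalong"]),
    ("VP", ["juxing", "le", "huitan"]),
]
print(" ".join(pre_reorder(sentence)))  # Bushi juxing le huitan yu Shalong
```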
77. Rule-based Pre-reordering for SMT
Source Text → Pre-Reordering → Reordered Source Text → PBSMT → Target Text
79. Pre-reordering: Training
The rules for pre-reordering can be
automatically acquired from a parallel
corpus using automatic word alignment
and parse trees on both sides.
80. Pre-reordering: Training
Parse the source sentence
Parse the target sentence
Align the words and phrases on
both sides
Extract the rewrite rules
86. Typical Hybrid MT Approaches
Selective MT
Pipelined MT
Mixture MT
Statistical Parsing in RBMT
Rule-based Named Entity Translation in SMT
Human-Acquired Rules in SMT
SMT Decoding with TM Phrases
87. Statistical Parsing in RBMT
Statistical parsing outperforms rule-based
parsing if we have a large-scale
treebank.
It is therefore reasonable to use a statistical
algorithm in the parsing component of
an RBMT system.
88. Rule-based Named Entity Translation
in SMT
Ney, H. (2013). Statistical MT Systems Revisited:
How much Hybridity do they have? Proceedings of
the Second Workshop on Hybrid Approaches to
Translation, page 7, Sofia, Bulgaria, August 8,
2013.
90. Human-Acquired Rules in SMT
Li, X., Lü, Y., Meng, Y., Liu, Q., & Yu, H.
Feedback Selecting of Manually Acquired
Rules Using Automatic Evaluation.
Proceedings of the 4th Workshop on Patent
Translation, pages 52-59, MT Summit XIII,
Xiamen, China, September 2011
91. Human-Acquired Rules in SMT
These rules are used in the decoding process
together with the hierarchical phrases of an
SMT system.
92. SMT Decoding with TM Phrases
Philipp Koehn and Jean Senellart. 2010. Convergence of
translation memory and statistical machine translation. In
AMTA Workshop on MT Research and the Translation
Industry, pages 21–31.
Wang, K., Zong, C., & Su, K. Y. Integrating Translation
Memory into Phrase-Based Machine Translation during
Decoding. Proceedings of the 51st Annual Meeting of the
Association for Computational Linguistics, pages 11–21,
Sofia, Bulgaria, August 4-9 2013
93. SMT Decoding with TM Phrases
Yanjun Ma, Yifan He, Andy Way and Josef van Genabith.
2011. Consistent translation using discriminative learning: a
translation memory-inspired approach. In Proceedings of the
49th Annual Meeting of the Association for Computational
Linguistics, pages 1239–1248, Portland, Oregon.
Yifan He, Yanjun Ma, Andy Way and Josef van Genabith.
2011. Rich linguistic features for translation memory-inspired
consistent translation. In Proceedings of the Thirteenth
Machine Translation Summit, pages 456–463.
94. SMT Decoding with TM Phrases
Extract TM phrases from similar
sentences in the translation memory
and use them in the decoding process
at runtime.
95. Outline
Why Hybrid MT?
An overview of Hybrid MT
Typical Hybrid MT Approaches
Conclusion
96. Conclusion
Different MT approaches have advantages and
disadvantages, which are usually complementary.
Hybrid MT can benefit from different MT
approaches.
Three categories of Hybrid MT were introduced:
Selective, Pipelined and Mixture.
In practice, almost all real MT systems are hybrid
systems.