Multi-Target Machine Translation with Multi-Synchronous Context-free Grammars @ NAACL Reading Group, Komachi Lab
1. Multi-Target Machine Translation with Multi-Synchronous Context-free Grammars
Graham Neubig, Philip Arthur, and Kevin Duh
Presenter: Shin Kanouchi
NAACL Reading 2015 @ Komachi Lab, TMU
2. Motivation
• When translating into language T1, equivalent translations into a second language T2 can help
• T1 has a weak language model
• T2 has a strong language model
→ Can we use T2 to improve the results?
3. Overall view
• Motivation
• Propose multi-synchronous context-free grammars (MSCFGs)
• How to learn MSCFGs
• How to perform search (decoding)
– including calculation of LM probabilities over multiple target language strings
• Experiments
– gains of up to 0.8-1.5 BLEU points
4. Proposed Framework
• Build on the well-known synchronous context-free grammars (SCFGs)
• Propose multi-synchronous context-free grammars (MSCFGs), with multiple targets
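The distinction between the two grammar formalisms can be sketched as a data structure: an SCFG rule pairs one source side with one target side, while an MSCFG rule pairs one source side with several target sides. A minimal sketch in Python, with hypothetical class names not taken from the paper:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SCFGRule:
    """Synchronous CFG rule: one source side, one target side."""
    source: List[str]   # e.g. ["a", "ratifié"]
    target: List[str]   # e.g. ["批准", "了"]

@dataclass
class MSCFGRule:
    """Multi-synchronous CFG rule: one source side, multiple target
    sides (here T1 and T2), generated synchronously."""
    source: List[str]
    targets: List[List[str]]  # one string per target language

# The running example from the talk: French source, Chinese T1, English T2.
rule = MSCFGRule(source=["a", "ratifié"],
                 targets=[["批准", "了"], ["ratified"]])
```
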
5. How to learn MSCFGs
• Learn from tri-lingual parallel data
1. Alignment
• alignments for each sentence are obtained automatically
• IBM models implemented in GIZA++ (Och and Ney, 2003)
2. Phrase Extraction
3. Calculate Features
[Figure: alignments between Source, T1, and T2; the two alignments are computed independently]
6. How to learn MSCFGs
• Learn from tri-lingual parallel data
1. Alignment
2. Phrase Extraction
• phrase-extract algorithm of Och (2002)
• Source → T1
– a → 了
– ratifié → 批准
– a ratifié → 批准 了
• Source → T2
– a ratifié → ratified
• Combined rule with both targets
– a ratifié → 批准 了 | ratified
3. Calculate Features
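The combination step above can be sketched as a join on the shared source side: phrase pairs extracted for Source→T1 and Source→T2 are merged into multi-target entries whenever they cover the same source phrase. A simplified sketch (the actual extraction in the paper is alignment-driven; the function name is hypothetical):

```python
def combine_phrase_tables(src_t1, src_t2):
    """Join two phrase tables on the shared source phrase to form
    multi-target entries: source -> (t1, t2)."""
    combined = []
    for src, t1 in src_t1:
        for src2, t2 in src_t2:
            if src == src2:
                combined.append((src, (t1, t2)))
    return combined

# Phrase pairs from the running example.
src_t1 = [("a", "了"), ("ratifié", "批准"), ("a ratifié", "批准 了")]
src_t2 = [("a ratifié", "ratified")]
multi = combine_phrase_tables(src_t1, src_t2)
# → [("a ratifié", ("批准 了", "ratified"))]
```
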
7. 3. Calculate Features (13 features)
• In standard SCFGs
– P(γ|α1) and P(α1|γ): log forward and backward translation probabilities
– Plex(γ|α1) and Plex(α1|γ): log forward and backward lexical translation probabilities
– a word penalty counting the terminals in α1
– a constant phrase penalty of 1
• In MSCFGs
– P(γ|α2) and P(α2|γ)
– Plex(γ|α2) and Plex(α2|γ)
– a word penalty for α2
• In addition
– P(γ|α1, α2) and P(α1, α2|γ)
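The forward and backward translation probabilities in the list above are typically relative-frequency estimates over the extracted phrase pairs. A minimal sketch of that estimation, assuming simple count-based maximum likelihood (the function name is hypothetical):

```python
import math
from collections import Counter

def phrase_probs(pairs):
    """Relative-frequency estimates of log P(target|source) (forward)
    and log P(source|target) (backward) from (source, target) pairs."""
    joint = Counter(pairs)
    src_counts = Counter(s for s, _ in pairs)
    tgt_counts = Counter(t for _, t in pairs)
    fwd = {(s, t): math.log(c / src_counts[s]) for (s, t), c in joint.items()}
    bwd = {(s, t): math.log(c / tgt_counts[t]) for (s, t), c in joint.items()}
    return fwd, bwd

# Toy counts: "a ratifié" seen 3 times, translated to 批准 了 twice.
pairs = [("a ratifié", "批准 了"), ("a ratifié", "批准 了"),
         ("a ratifié", "已 批准")]
fwd, bwd = phrase_probs(pairs)
# fwd[("a ratifié", "批准 了")] = log(2/3); bwd for that pair = log(2/2) = 0
```
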
8. Decoding
(a) one LM (only T1)
(b) joint search method: based on consecutively expanding the LM states of both T1 and T2
(c) sequential search method: first expands the state space of T1, then later expands the search space of T2
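The cost difference between (b) and (c) comes from how hypotheses are recombined: joint search must keep every combination of T1 and T2 LM states distinct, while the first pass of sequential search distinguishes hypotheses by the T1 state alone. A toy illustration with made-up LM contexts (not taken from the slides):

```python
# Hypothetical n-gram LM contexts a decoder might need to distinguish.
t1_states = {"批准", "了", "批准 了"}
t2_states = {"ratified", "has ratified", "ratifies", "ratify"}

# Joint search: LM states of both targets are expanded together, so the
# state space is the cross product of the two.
joint_space = {(s1, s2) for s1 in t1_states for s2 in t2_states}

# Sequential search, first pass: only T1 states are expanded; the T2
# states are expanded in a later pass over the pruned T1 search space.
sequential_first_pass = set(t1_states)

sizes = (len(joint_space), len(sequential_first_pass))  # (12, 3)
```
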
9. Experiments
• Multi UN Corpus:
– Parallel and T1 LM data: 100,000 sentences
– T2 LM data: 4,000,000 sentences
– S: en; T1, T2: ar, es, fr, ru, zh (all combinations)
• Decoder: Travatar (Neubig, 2013)
• Baseline: a standard SCFG grammar with only the source and T1
• Proposed: the full MSCFG model with the T2 LM
10. Result 1
• T2 = Spanish gives the best results
• Particularly effective for similar languages
[Figure: BLEU scores]
11. Result 2
• BLEU scores for different T1 LM sizes without (-LM2) or with (+LM2) an LM for the second target