11. Translation Memory Exchange
• OSCAR (Open Standards for Container/Content Allowing Re-use).
• TMX Standard (Translation Memory eXchange).
• Leveraging of translation memories regardless of the tool or platform.
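As a rough illustration of why TMX makes memories portable, the sketch below reads translation units from a TMX document using only the Python standard library. The sample document, its header values and the helper name read_tmx are made up for this example; the element names (tu, tuv, seg) and the xml:lang attribute follow the TMX format.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample; header values are placeholders for illustration only.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="ExampleTool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Save the file.</seg></tuv>
      <tuv xml:lang="es"><seg>Guarde el archivo.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx(text, src_lang, tgt_lang):
    """Return (source, target) segment pairs for one language pair."""
    root = ET.fromstring(text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline tags.
                segs[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


print(read_tmx(SAMPLE_TMX, "en", "es"))
```

Any tool that can write and read this structure can exchange memories with any other, which is the point of the standard.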
12. The ancestors of CAT Tools…
XL8
DOS tool in a workflow known as XLN
26. Basic TM features in CAT tools
• Leverage of previous translations.
• Analysis for quoting, planning and keeping track of progress.
• Concordance for sub-segment searches.
• Maintenance to perform global changes, import/export content, etc.
27. Leveraging TMs
CAT tools provide answers to these questions:
• What is the fuzzy match of the segment?
• What parts of the text are different?
• Where is the match coming from?
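The three questions above can be sketched with a toy lookup. This is only an approximation under stated assumptions: real CAT tools use their own matching algorithms, and here the score is simply the word-level similarity ratio from Python's difflib. The TM entries, file names and the function best_match are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical TM entries; "origin" stands for the file the entry came from.
TM = [
    {"source": "Click the Save button.",
     "target": "Haga clic en el botón Guardar.",
     "origin": "manual_v1.tmx"},
    {"source": "Close the window.",
     "target": "Cierre la ventana.",
     "origin": "help_v2.tmx"},
]


def best_match(segment, tm):
    """Return score, differing parts and origin of the closest TM entry."""
    new_words = segment.split()
    best = None
    for entry in tm:
        old_words = entry["source"].split()
        matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
        score = round(matcher.ratio() * 100)  # assumed scoring, not a standard
        if best is None or score > best["score"]:
            changes = [
                (tag, old_words[i1:i2], new_words[j1:j2])
                for tag, i1, i2, j1, j2 in matcher.get_opcodes()
                if tag != "equal"
            ]
            best = {"score": score, "changes": changes,
                    "origin": entry["origin"], "target": entry["target"]}
    return best


result = best_match("Click the Cancel button.", TM)
print(result["score"])    # fuzzy match percentage
print(result["changes"])  # which words differ
print(result["origin"])   # where the match comes from
```

Because the scoring formula is a free choice, two tools can report different percentages for the same segment and the same TM.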
34. Different tools, different word counts
Band          CAT Tool 1    CAT Tool 2
101%              41,352        29,782
100%               4,194        16,002
99-95%             3,698         6,038
94-85%             2,077         2,633
84-75%             5,270         1,369
New words          5,241         6,150
Repetitions        2,068         5,451
Total             63,900        58,425
35. Different word counts
• There is no standard fuzzy matching algorithm.
• CAT tools may have different auto-substitution elements: numbers, dates, acronyms, variables, etc.
• Different approaches to 101% matches.
• Cross-file repetitions and internal fuzzy leverage.
• Different file format filters.
• Different segmentation rules. SRX (Segmentation Rules eXchange) is the standard for exchanging segmentation rules.
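To see how segmentation rules alone can change an analysis, the sketch below splits the same text with two made-up rule sets. The regular expressions are assumptions for illustration and are not SRX syntax; SRX expresses comparable break and no-break rules in XML.

```python
import re

TEXT = "Press OK, e.g. to confirm. Then restart."

# Rule set A (hypothetical): break after ., ! or ? followed by whitespace.
BREAK_A = re.compile(r"(?<=[.!?])\s+")
# Rule set B (hypothetical): same rule, but never break right after "e.g."
BREAK_B = re.compile(r"(?<!e\.g\.)(?<=[.!?])\s+")


def segment(text, break_pattern):
    """Split text into segments at every position the pattern matches."""
    return [s for s in break_pattern.split(text) if s]


segments_a = segment(TEXT, BREAK_A)
segments_b = segment(TEXT, BREAK_B)
print(len(segments_a), segments_a)
print(len(segments_b), segments_b)
```

The word total is the same in both cases, but the segments sent to the TM differ, so the same words can land in different match bands depending on the tool.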
36. Weighted word count
Each band is assigned a percentage of the full word rate
according to a weighting scheme (negotiable per client). For
example:
Band          % of full word rate
101%            0%
100%           20%
99-95%         30%
94-85%         40%
84-75%         50%
New words     100%
Repetitions    20%
37. Different tools, different word counts (II)
              CAT Tool 1                         CAT Tool 2
Band          Words x weight    Weighted words   Words x weight    Weighted words
101%          41,352 x 0%                   0    29,782 x 0%                   0
100%           4,194 x 20%                839    16,002 x 20%              3,200
99-95%         3,698 x 30%              1,109     6,038 x 30%              1,811
94-85%         2,077 x 40%                831     2,633 x 40%              1,053
84-75%         5,270 x 50%              2,635     1,369 x 50%                684
New words      5,241 x 100%             5,241     6,150 x 100%             6,150
Repetitions    2,068 x 20%                414     5,451 x 20%              1,090
Total         63,900                   11,069    58,425                   14,989
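The CAT Tool 1 figures can be reproduced with a few lines of Python. This is a minimal sketch: the dictionary and function names are made up, the weights are the example scheme from slide 36, and rounding each band to the nearest whole word is an assumption about how the slide's figures were obtained.

```python
# Example weighting scheme from slide 36, as percentages of the full rate.
WEIGHTS = {
    "101%": 0, "100%": 20, "99-95%": 30, "94-85%": 40,
    "84-75%": 50, "New words": 100, "Repetitions": 20,
}

# Word counts reported by CAT Tool 1.
TOOL_1 = {
    "101%": 41352, "100%": 4194, "99-95%": 3698, "94-85%": 2077,
    "84-75%": 5270, "New words": 5241, "Repetitions": 2068,
}


def weighted_words(counts, weights):
    """Weight each band, rounding to the nearest whole word (assumed)."""
    # Integer arithmetic avoids floating-point surprises when rounding.
    return {band: (words * weights[band] + 50) // 100
            for band, words in counts.items()}


weighted = weighted_words(TOOL_1, WEIGHTS)
print(weighted)
print(sum(TOOL_1.values()), sum(weighted.values()))
```

Swapping in another tool's counts or a client-specific weighting scheme only requires changing the two dictionaries.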
39. TMs and statistical analysis
• If big enough, TMs provide the bilingual corpus necessary to build statistical machine translation (SMT) engines.
• Some CAT tools can scan the TM for correlations between words in the source and the target.
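One simple way to look for such correlations is to count how often a source word and a target word appear in the same translation unit. The sketch below uses the Dice coefficient on a tiny made-up TM; it is an assumption chosen for illustration, not the method of any particular CAT tool or SMT engine.

```python
from collections import Counter
from itertools import product

# Hypothetical English-Spanish TM, far too small for real statistics.
TM = [
    ("open the file", "abrir el archivo"),
    ("save the file", "guardar el archivo"),
    ("open the window", "abrir la ventana"),
    ("close the program", "cerrar el programa"),
]


def dice_table(tm):
    """Score every co-occurring (source word, target word) pair."""
    src_freq, tgt_freq, pair_freq = Counter(), Counter(), Counter()
    for source, target in tm:
        src_words = set(source.lower().split())
        tgt_words = set(target.lower().split())
        src_freq.update(src_words)
        tgt_freq.update(tgt_words)
        pair_freq.update(product(src_words, tgt_words))
    # Dice coefficient: 2 * joint count / (source count + target count).
    return {(s, t): 2 * n / (src_freq[s] + tgt_freq[t])
            for (s, t), n in pair_freq.items()}


def best_candidate(word, scores):
    """Return the target word with the highest score for a source word."""
    candidates = {t: v for (s, t), v in scores.items() if s == word}
    return max(candidates, key=candidates.get)


scores = dice_table(TM)
print(best_candidate("file", scores))
print(best_candidate("open", scores))
```

With only four units the scores are fragile, which is why the TM has to be big enough before this kind of analysis becomes useful.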