USER CONFERENCE 2009 BEFORE MT
Normalization of translation memories/training data for MT
Moderator: Karen R. Combe, PTC
Ryan Martin, Intel
Chris Wendt, Microsoft
William Wong, Language Weaver
Olga Beregovaya, ProMT
Agenda
Issue: Excessive number of internal tags
French: Pour effectuer la plupart de ces tâches, vous pouvez utiliser {1}{2} Fichier (File) {3}{4} Traitement des instances (Instance Operations) {5}{6} Actualiser l'index (Update Index) {7}{8} ou {9}{10} Fichier (File) {11}{12} Traitement des instances (Instance Operations) {13}{14} Options d'accélérateur (Accelerator Options) {15}{16} afin d'ouvrir la boîte de dialogue {17} Accélérateur d'instances (Instance Accelerator) {18}
English: You can use {1}{2} File {3}{4} Instance Operations {5}{6} Update Index {7}{8}{9}{10} File {11}{12} Instance Operations {13} {14} Accelerator Options {15}{16} (which opens the {17} Instance Accelerator {18} dialog box) to perform most instance operations.
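One way to detect segments like the one above is to compare the number of placeholder tags with the number of words. The following is a minimal sketch, not a method described in the slides: the function names and the 0.5 tags-per-word threshold are assumptions chosen for illustration.

```python
import re

# Placeholder tags in the slide's example look like {1}, {2}, ...
TAG_RE = re.compile(r"\{\d+\}")

def tag_density(segment: str) -> float:
    """Return the number of placeholder tags per word in the segment."""
    tags = TAG_RE.findall(segment)
    words = TAG_RE.sub(" ", segment).split()
    return len(tags) / max(len(words), 1)

def has_excessive_tags(segment: str, max_density: float = 0.5) -> bool:
    """Flag a segment whose tag-to-word ratio exceeds an assumed threshold."""
    return tag_density(segment) > max_density

# English side of the slide's example segment.
example = (
    "You can use {1}{2} File {3}{4} Instance Operations {5}{6} "
    "Update Index {7}{8}{9}{10} File {11}{12} Instance Operations "
    "{13} {14} Accelerator Options {15}{16} (which opens the {17} "
    "Instance Accelerator {18} dialog box) to perform most instance operations."
)

print(has_excessive_tags(example))
```

The threshold would need tuning on real data; a segment with one or two formatting tags around a UI label should normally pass.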
Issue: Irrelevant data
English: 0.31%
French: 0,31 %
English: &asm.mbr.name==part*
French: &asm.mbr.name==pièce*
English: (Windows NT/95/98/2000)D:artlib1}bjects
French: (Windows NT/95/98/2000)D:artlib1}bjects
Issue: Homonyms
English: This figure shows that after midsurface compression, the resulting model develops a gap between the collet and the bracket.
French: Cette figure montre qu'après la compression en feuillet moyen, le modèle obtenu crée un jeu entre le collet et le gousset.
English: All data in brackets [] are optional.
French: Toutes les données entre crochets [] sont facultatives.
Bracket #1 (gousset): An overhanging member that projects from a structure (as a wall) and is usually designed to support a vertical load or to strengthen an angle.
Bracket #2 (crochet): The bracket character, such as [ or (.
Issue: Acronyms spelled out in the target
English: You cannot propagate SDTAEs and DTAEs in a DTAF.
French: Vous ne pouvez propager ni des éléments d'annotation d'étiquette de référence ni des éléments d'annotation de référence de positionnement à l'intérieur d'une FARP.
Issue: Mismatching number of sentences
English: You can have multiple entries for the same pipe size in the bend file, that is, a single pipe size can have multiple bend radius values associated with it, as shown in the following example of a bend file.
French: Vous pouvez avoir plusieurs entrées pour la même taille de tuyau dans le fichier de pliage. En d'autres termes, une même taille de tuyau peut être associée à plusieurs valeurs de rayon de pliage, comme dans le fichier de pliage d'exemple suivant.
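A sentence-count mismatch of this kind can be flagged with a simple check. The sketch below uses a deliberately naive splitter (sentence-final punctuation followed by whitespace or end of string); it is an assumption for illustration, and a production pipeline would need a language-aware sentence breaker that handles abbreviations and decimals.

```python
import re

# Naive rule: a sentence ends at ., ! or ? followed by whitespace or end of text.
SENT_END_RE = re.compile(r"[.!?]+(?=\s|$)")

def count_sentences(text: str) -> int:
    """Count sentence-final punctuation marks in the text."""
    return len(SENT_END_RE.findall(text.strip()))

def sentence_count_mismatch(source: str, target: str) -> bool:
    """Return True when source and target contain different numbers of sentences."""
    return count_sentences(source) != count_sentences(target)

english = (
    "You can have multiple entries for the same pipe size in the bend file, "
    "that is, a single pipe size can have multiple bend radius values "
    "associated with it, as shown in the following example of a bend file."
)
french = (
    "Vous pouvez avoir plusieurs entrées pour la même taille de tuyau dans le "
    "fichier de pliage. En d'autres termes, une même taille de tuyau peut être "
    "associée à plusieurs valeurs de rayon de pliage, comme dans le fichier de "
    "pliage d'exemple suivant."
)

print(count_sentences(english), count_sentences(french))
print(sentence_count_mismatch(english, french))
```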
Issue: Inconsistent double quote usage
French: Ainsi, si vous créez une pièce portant le nom " bracket " , elle est tout d'abord enregistrée dans le fichier {1}.
English: For example, if you create a part with the name bracket, it initially saves to the file name {1}.
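Inconsistent quoting can be surfaced by counting quotation marks on each side of the pair. This is a minimal sketch under the assumption that straight, curly, and French angle quotes should all be treated as equivalent; the character set and function names are illustrative only.

```python
# Straight double quote, curly double quotes, and French guillemets.
QUOTE_CHARS = '"\u201c\u201d\u00ab\u00bb'

def count_quotes(text: str) -> int:
    """Count all double-quote-like characters in the text."""
    return sum(text.count(ch) for ch in QUOTE_CHARS)

def quote_mismatch(source: str, target: str) -> bool:
    """Return True when the two sides use a different number of quote marks."""
    return count_quotes(source) != count_quotes(target)

english = ("For example, if you create a part with the name bracket, "
           "it initially saves to the file name {1}.")
french = ("Ainsi, si vous créez une pièce portant le nom \" bracket \" , "
          "elle est tout d'abord enregistrée dans le fichier {1}.")

print(quote_mismatch(english, french))
```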
Issue: Entity mismatch
English: One way is to create a "flexible model.
French: Une méthode consiste à créer un modèle souple.
Issue: Punctuation mismatch (parentheses vs. dash)
English: {1}Copy as Skeleton{2} (the option cannot be changed) to create a skeleton model.
French: Cliquez sur {1}Copier en tant que squelette (Copy as Skeleton){2} - option non modifiable - pour créer un modèle squelette.
Issue: Punctuation mismatch (dash vs. colon)
English: {1}Additional Rotation{2} — Enter a real-number value for the number of degrees to rotate the spring's Y axis.
French: {1}Rotation supplémentaire (Additional Rotation){2} : entrez un nombre réel pour indiquer le nombre de degrés de rotation de l'axe Y du ressort.
Issue: Capitalization mismatch
English: Piping Master Catalog Directory File
French: Fichier répertoire du catalogue principal de tuyauterie
Issue: English UI strings in the translation
English: Click View > Color and Appearance to create or modify colors.
French: Cliquez sur Affichage ( View ) > Couleur et apparence ( Color and Appearance ) pour créer ou modifier les couleurs.
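One possible normalization for this case is to drop a parenthesized string from the target when the same string occurs verbatim in the source. The sketch below is an assumption about how such a cleanup could work, not a procedure taken from the slides; `strip_source_ui_strings` is a hypothetical helper name.

```python
import re

# A parenthesized group with optional padding, plus any whitespace before it.
PAREN_RE = re.compile(r"\s*\(\s*([^()]+?)\s*\)")

def strip_source_ui_strings(source: str, target: str) -> str:
    """Remove parenthesized target text that repeats a string found in the source."""
    def replace(match: re.Match) -> str:
        # Keep parentheses whose content is not a copy of source-language text.
        return "" if match.group(1) in source else match.group(0)
    return PAREN_RE.sub(replace, target)

english = "Click View > Color and Appearance to create or modify colors."
french = ("Cliquez sur Affichage ( View ) > Couleur et apparence "
          "( Color and Appearance ) pour créer ou modifier les couleurs.")

print(strip_source_ui_strings(english, french))
```

Parenthesized text that does not appear in the source is left untouched, so genuine parenthetical remarks in the translation survive.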
Issue: Fix common entity issues
Corrected:
English: System without Intel® vPro technology
Portuguese: Sistema sem a tecnologia Intel® vPro
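The uncorrected form of this segment did not survive on the slide, so the input below is an assumed example: a registered-trademark entity that was escaped twice. The sketch decodes character references repeatedly until the text stops changing, using the standard-library `html.unescape`.

```python
import html

def unescape_fully(text: str, max_rounds: int = 3) -> str:
    """Decode HTML character references, repeating to undo double escaping."""
    for _ in range(max_rounds):
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
    return text

# Hypothetical damaged input; only the corrected form appears in the slides.
damaged = "System without Intel&amp;reg; vPro technology"
print(unescape_fully(damaged))
```

The `max_rounds` cap is an arbitrary safety limit for this sketch.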
Issue: Remove internal markup
<tuv xml:lang="ZH-CN"> <seg> <bpt i="1">&lt;span style='font-size:10.0pt; font-family:Verdana'></bpt> 在默认情况下,节点 <bpt i="2" type="bold">&lt;b></bpt> 应用程序 <ept i="2">&lt;/b></ept> 之下没有任何应用程序,如下图所示。 <ept i="1">&lt;/span></ept> </seg></tuv></tu>
Corrected:
<tuv xml:lang="ZH-CN"> <seg> 在默认情况下,节点应用程序之下没有任何应用程序,如下图所示。 </seg> </tuv> </tu>
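The correction shown above can be approximated with the standard-library XML parser: inline TMX elements that carry native formatting codes are dropped, and the text that follows each of them is kept. This is a sketch only; the input is adapted from the slide (a shortened style attribute, a well-formed fragment without the stray closing `</tu>`), and `strip_inline_markup` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# TMX inline elements whose content is native formatting code, not text.
CODE_TAGS = {"bpt", "ept", "ph", "it", "ut"}

def strip_inline_markup(seg: ET.Element) -> str:
    """Return the segment text with inline formatting codes removed."""
    parts = [seg.text or ""]
    for child in seg:
        if child.tag not in CODE_TAGS:
            # Elements such as <hi> wrap translatable text, so keep it.
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts).strip()

tuv = ET.fromstring(
    '<tuv xml:lang="ZH-CN"><seg>'
    '<bpt i="1">&lt;span style=\'font-size:10.0pt\'&gt;</bpt>'
    '在默认情况下,节点<bpt i="2" type="bold">&lt;b&gt;</bpt>应用程序'
    '<ept i="2">&lt;/b&gt;</ept>之下没有任何应用程序,如下图所示。'
    '<ept i="1">&lt;/span&gt;</ept></seg></tuv>'
)

print(strip_inline_markup(tuv.find("seg")))
```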
Issue: Empty field
Issue: Suspect character
Issue: Escape character in translation
Issue: Trivial segment; missing sentence features
Issue: Incomplete translation, missing punctuation
Data Issues
Training Data example
Metadata handling by PROMT
Standard TM verification/normalization process
PROMT handling of internal tags – not excessive but useful
GMS Integration with XLIFF Connector – Why is metadata so important?
PROMT handling of irrelevant data
PROMT handling of homonyms
PROMT handling of expanding acronyms
PROMT handling of locale-specific punctuation
PROMT handling of Entity and Capitalization mismatch
Issue: English UI strings in the translation
English: Click View > Color and Appearance to create or modify colors.
French: Cliquez sur Affichage ( View ) > Couleur et apparence ( Color and Appearance ) pour créer ou modifier les couleurs.
PROMT suggestion for UI string handling
Intuitive contextual identification
A word that belongs to a UI context, such as “show” in “show command,” remains in English as it appears in the UI, whereas the word “command” is translated. In other contexts, both “show” and “command” are translated as regular words.
PROMT approach to entities
PROMT handling of internal markup
PROMT handling of empty fields
Microsoft Translator
In General
Automatic training data filtering and conversion
Cleaning issues 1/3
Issue | Training action | Runtime action
Excessive # of internal tags | Remove segment | Preserve and ignore
Irrelevant data | Fails in ratio filter | Apply factoids
Homonyms | n/a | Target language model
Acronyms spelled out | May be caught in word alignment | Project dictionary
# sentences mismatch | Sentence break, align, discard | n/a
Inconsistent quote usage | n/a | Not handled
Entities | Unescape | Unescape-reescape, XML-safe
Punctuation mismatch | n/a | Needs special code (i.e. French “ :”)
Cleaning issues 2/3
Issue | Training action | Runtime action
Capitalization mismatch | Ignored | Apply language logic and target language model
English UI strings | Factoid | Preprocess and escape
Internal markup | Escape to single tag | Pass through
Empty field | Fails size delta filter | Pass through
Suspect character | May fail language check | Pass through
HTML escapes | Unescape | Unescape – reescape for XML
Trivial segment | Fails length or ratio filter | Pass through
Missing punctuation | None | Apply language-appropriate punctuation
Cleaning issues 3/3
Issue | Training action | Runtime action
Newline in string | New sentence | New sentence
Program code | Avoided or learned | Needs markup
Comparable data | Currently fails length filter; research item | Handled like “parallel”
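The tables above refer to length and ratio filters without giving their definitions. The following sketch shows one common form such a filter can take, a minimum word count combined with a maximum source/target length ratio. The thresholds and the function name are assumptions for illustration and are not taken from the Microsoft Translator pipeline.

```python
def passes_length_ratio(source: str, target: str,
                        min_words: int = 3, max_ratio: float = 3.0) -> bool:
    """Keep a segment pair only if both sides are long enough and similar in length."""
    src_len = len(source.split())
    tgt_len = len(target.split())
    shorter, longer = min(src_len, tgt_len), max(src_len, tgt_len)
    if shorter < min_words:
        # Trivial segments such as bare numbers carry little training signal.
        return False
    return longer / shorter <= max_ratio

pairs = [
    ("0.31%", "0,31 %"),
    ("All data in brackets [] are optional.",
     "Toutes les données entre crochets [] sont facultatives."),
]
for src, tgt in pairs:
    print(passes_length_ratio(src, tgt), "|", src)
```

With these assumed thresholds the numeric pair from the "Irrelevant data" slide is rejected as too short, while the ordinary sentence pair is kept.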

USER CONFERENCE 2009 BEFORE MT NORMALIZATION

  • 2. Normalization of translation memories/training data for MT. Moderator: Karen R. Combe, PTC. Ryan Martin, Intel; Chris Wendt, Microsoft; William Wong, Language Weaver; Olga Beregovaya, ProMT
  • 5. Issue: Excessive number of internal tags French: Pour effectuer la plupart de ces tâches, vous pouvez utiliser {1}{2} Fichier (File) {3}{4} Traitement des instances (Instance Operations) {5}{6} Actualiser l'index (Update Index) {7}{8} ou {9}{10} Fichier (File) {11}{12} Traitement des instances (Instance Operations) {13}{14} Options d'accélérateur (Accelerator Options) {15}{16} afin d'ouvrir la boîte de dialogue {17} Accélérateur d'instances (Instance Accelerator) {18} English: You can use {1}{2} File {3}{4} Instance Operations {5}{6} Update Index {7}{8}{9}{10} File {11}{12} Instance Operations {13} {14} Accelerator Options {15}{16} (which opens the {17} Instance Accelerator {18} dialog box) to perform most instance operations.
  • 6. Issue: Irrelevant data English: 0.31% French: 0,31 % English: &asm.mbr.name==part* French: &asm.mbr.name==pièce* English: (Windows NT/95/98/2000)D:artlib1}bjects French: (Windows NT/95/98/2000)D:artlib1}bjects
  • 7. Issue: Homonyms English: This figure shows that after midsurface compression, the resulting model develops a gap between the collet and the bracket. French: Cette figure montre qu'après la compression en feuillet moyen, le modèle obtenu crée un jeu entre le collet et le gousset. English: All data in brackets [] are optional. French: Toutes les données entre crochets [] sont facultatives. Bracket #1 (gousset): An overhanging member that projects from a structure (as a wall) and is usually designed to support a vertical load or to strengthen an angle. Bracket #2 (crochet): The bracket character, such as [ or (.
  • 8. Issue: Acronyms spelled out in the target English: You cannot propagate SDTAEs and DTAEs in a DTAF. French: Vous ne pouvez propager ni des éléments d'annotation d'étiquette de référence ni des éléments d'annotation de référence de positionnement à l'intérieur d'une FARP.
  • 9. Issue: Mismatching number of sentences English: You can have multiple entries for the same pipe size in the bend file, that is, a single pipe size can have multiple bend radius values associated with it, as shown in the following example of a bend file. French: Vous pouvez avoir plusieurs entrées pour la même taille de tuyau dans le fichier de pliage. En d'autres termes, une même taille de tuyau peut être associée à plusieurs valeurs de rayon de pliage, comme dans le fichier de pliage d'exemple suivant.
  • 10. Issue: Inconsistent double quote usage French: Ainsi, si vous créez une pièce portant le nom "bracket", elle est tout d'abord enregistrée dans le fichier {1}. English: For example, if you create a part with the name bracket, it initially saves to the file name {1}.
  • 11. Issue: Entity mismatch English: One way is to create a &quot; flexible model. French: Une méthode consiste à créer un modèle souple.
  • 12. Issue: Punctuation mismatch (parentheses vs. dash) English: {1}Copy as Skeleton{2} (the option cannot be changed) to create a skeleton model. French: Cliquez sur {1}Copier en tant que squelette (Copy as Skeleton){2} - option non modifiable - pour créer un modèle squelette.
  • 13. Issue: Punctuation mismatch (dash vs. colon) English: {1}Additional Rotation{2} — Enter a real-number value for the number of degrees to rotate the spring's Y axis. French: {1}Rotation supplémentaire (Additional Rotation){2} : entrez un nombre réel pour indiquer le nombre de degrés de rotation de l'axe Y du ressort.
  • 14. Issue: Capitalization mismatch English: Piping Master Catalog Directory File French: Fichier répertoire du catalogue principal de tuyauterie
  • 15. Issue: English UI strings in the translation English: Click View > Color and Appearance to create or modify colors. French: Cliquez sur Affichage (View) > Couleur et apparence (Color and Appearance) pour créer ou modifier les couleurs.
  • 17. Issue: Remove internal markup <tuv xml:lang="ZH-CN"> <seg> <bpt i="1">&lt;span style='font-size:10.0pt; font-family:Verdana'></bpt> 在默认情况下,节点 <bpt i="2" type="bold">&lt;b></bpt> 应用程序 <ept i="2">&lt;/b></ept> 之下没有任何应用程序,如下图所示。 <ept i="1">&lt;/span></ept> </seg></tuv></tu> Corrected: <tuv xml:lang="ZH-CN"> <seg> 在默认情况下,节点应用程序之下没有任何应用程序,如下图所示。 </seg> </tuv> </tu>
  • 34. GMS Integration with XLIFF Connector – Why is metadata so Important?
  • 40. Issue: English UI strings in the translation English: Click View > Color and Appearance to create or modify colors. French: Cliquez sur Affichage (View) > Couleur et apparence (Color and Appearance) pour créer ou modifier les couleurs.
  • 42. Intuitive contextual identification: A word that occurs as part of a UI context, such as “show” in “show command”, remains in English per the UI, whereas the word “command” is translated. In other contexts, both “show” and “command” are translated as regular words.
  • 49. Cleaning issues 1/3
      Issue | Training action | Runtime action
      Excessive # of internal tags | Remove segment | Preserve and ignore
      Irrelevant data | Fails in ratio filter | Apply factoids
      Homonyms | n/a | Target language model
      Acronyms spelled out | May be caught in word alignment | Project dictionary
      # sentences mismatch | Sentence break, align, discard | n/a
      Inconsistent quote usage | n/a | Not handled
      Entities | Unescape | Unescape-reescape XML-safe
      Punctuation mismatch | n/a | Needs special code (i.e. French “ :”)
  • 50. Cleaning issues 2/3
      Issue | Training action | Runtime action
      Capitalization mismatch | Ignored | Apply language logic and target language model
      English UI strings | Factoid | Preprocess and escape
      Internal markup | Escape to single tag | Pass through
      Empty field | Fails size delta filter | Pass through
      Suspect character | May fail language check | Pass through
      HTML escapes | Unescape | Unescape – reescape for XML
      Trivial segment | Fails length or ratio filter | Pass through
      Missing punctuation | None | Apply language-appropriate punctuation
  • 51. Cleaning issues 3/3
      Issue | Training action | Runtime action
      Newline in string | New sentence | New sentence
      Program code | Avoided or learned | Needs markup
      Comparable data | Currently fails length filter; research item | Handled like “parallel”
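
The training-side actions in the cleaning tables (remove segments with excessive internal tags, length or ratio filter, sentence-count mismatch) can be illustrated with a small filter. This is only a sketch under stated assumptions, not the panelists' implementation: the thresholds MAX_TAGS and MAX_LENGTH_RATIO, the function names, and the naive sentence-boundary regex are all hypothetical, and a mismatch in sentence count is simply discarded here, whereas the slides list re-breaking and re-aligning before discarding.

```python
import re

# Hypothetical thresholds; the slides do not give concrete values.
MAX_TAGS = 6
MAX_LENGTH_RATIO = 3.0

# Internal tags in the form shown on the slides, e.g. {1} or {18}.
TAG = re.compile(r"\{\d+\}")
# Naive sentence end: ., ! or ? followed by whitespace or end of string.
SENTENCE_END = re.compile(r"[.!?](?:\s|$)")


def tag_count(segment: str) -> int:
    """Number of internal tags in a segment."""
    return len(TAG.findall(segment))


def length_ratio(source: str, target: str) -> float:
    """Longer/shorter character length, measured with tags removed."""
    a = len(TAG.sub("", source).strip())
    b = len(TAG.sub("", target).strip())
    if min(a, b) == 0:
        return float("inf")  # an empty side always fails the filter
    return max(a, b) / min(a, b)


def sentence_count(segment: str) -> int:
    """Rough sentence count based on terminal punctuation."""
    return len(SENTENCE_END.findall(segment))


def keep_for_training(source: str, target: str) -> bool:
    """Return True if the segment pair passes all three filters."""
    if max(tag_count(source), tag_count(target)) > MAX_TAGS:
        return False
    if length_ratio(source, target) > MAX_LENGTH_RATIO:
        return False
    if sentence_count(source) != sentence_count(target):
        return False
    return True
```

With these assumed thresholds, a pair such as "Click {1}OK{2}." / "Cliquez sur {1}OK{2}." is kept, while a pair with 18 internal tags, or one whose target splits a single source sentence in two, is dropped.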

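The "Entities", "HTML escapes" and "Internal markup" rows can be sketched in the same way. The helper names below are hypothetical, and the behavior is an assumption modeled on the slide 17 example, where inline TMX elements are dropped entirely. The sketch uses only Python's standard html.unescape and xml.sax.saxutils.escape: unescape repeatedly for training data (which also resolves double escapes such as &amp;quot;), and re-escape at runtime so the output stays XML-safe.

```python
import html
import re
from xml.sax.saxutils import escape

# Paired TMX inline elements, removed together with their content.
INLINE = re.compile(r"<(bpt|ept|ph|it)\b[^>]*>.*?</\1>", re.DOTALL)


def strip_inline_markup(seg: str) -> str:
    """Drop inline markup elements and collapse leftover whitespace."""
    text = INLINE.sub("", seg)
    return re.sub(r"\s+", " ", text).strip()


def unescape_fully(text: str, max_rounds: int = 3) -> str:
    """Unescape entities until the text stops changing (bounded)."""
    for _ in range(max_rounds):
        decoded = html.unescape(text)
        if decoded == text:
            break
        text = decoded
    return text


def reescape_for_xml(text: str) -> str:
    """Escape &, < and > so the text can be placed back into XML."""
    return escape(text)
```

For example, strip_inline_markup applied to 'Click <bpt i="1">&lt;b&gt;</bpt>OK<ept i="1">&lt;/b&gt;</ept> now' gives 'Click OK now'. Repeated unescaping is a deliberate trade-off: it repairs double-escaped data, but it would also decode text that was meant to show a literal entity.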
Editor's notes

  1. Visit www.ptc.com Nov 4, 2009 © Copyright 2000 Parametric Technology Corporation Page
  2. Punctuation marks vary between languages and sometimes even between product versions. Some languages put a space between the word and the colon (:); others do not.