Some useful tips for translators at absolutely no cost!
Learn how to create a corpus with BootCat Front End and analyze it with AntConc.
Learn how to extract tracked changes using the free Word add-in ExtractData by DocTools.
Learn how to search multiple PDFs at once using Acrobat Reader.
Corpora, tracked changes, and PDFs: some useful tips, at no cost!
1. Corpora, Tracked Changes,
and PDFs
Some useful tips at no cost!
The Translation and Localization
Conference 2017
Patricia M. Ferreira Larrieux
EN <> ES <> IT Medical & Technical Translator
2. Agenda
About me
Purpose of this presentation
Working with corpora: your way to specialized terminology
Extracting tracked changes
Searching in multiple PDF files at once
3. About me
Born in Uruguay, living in Italy since 1990
Degree in English<>Spanish Translation
Ran my own translation company for 7 years
10 years at Johnson & Johnson (2003-2013)
May 2013: returned to freelancing
Currently freelance medical & technical EN<>ES<>IT translator
300K+ words translated in 2016
Member of: CTPU, ITI, ASETRAD, TREMÉDICA, MET
4. Purpose of This Presentation
Sharing tips on:
Corpora – how to use BootCat & AntConc
Tracked changes – how to use DocTools ExtractData
PDFs – searching multiple PDFs with Acrobat Reader
Note: I am in no way connected with the respective owners of these software programs!
5. What is a corpus?
7. Why are Corpora useful for Translators?
They are a great resource for terminology and phraseology.
Monolingual corpora in the target language have proved to be an outstanding terminological tool for specialized translation (Bowker, 1998).
8. Online Corpora
The British National Corpus
http://corpus.byu.edu/bnc/
A collection of English corpora
http://corpus.leeds.ac.uk/protected/query.html
Michigan Corpus of Academic Spoken English
http://quod.lib.umich.edu/cgi/c/corpus/corpus?c=micase;page=simple
9. Online Corpora (cont’d)
Corpus de Referencia del Español Actual (CREA)
http://www.rae.es/recursos/banco-de-datos/crea
Corpora created by Mark Davies, Professor of
Linguistics at Brigham Young University.
http://corpus.byu.edu/corpora.asp
Paisà
http://www.corpusitaliano.it/
10. Building Your Own Corpora:
BootCat Front End
BootCaT Front End is free software developed by a group of linguists from the Universities of Bologna (Forlì Campus), Trento and Zagreb:
Marco Baroni (Trento) & Silvia Bernardini (Forlì): wrote the original scripts
Eros Zanchetta (Forlì): wrote the BootCaT front-end and the Bing URL collector, updated a few other scripts and maintains the BootCaT website
Nikola Ljubešić (Zagreb): wrote the BootCaTExtractor included since version 0.7 of the frontend and version 0.1.8 of the toolkit
Cyrus Shaoul (University of Alberta): contributed the (now retired) script to collect pages from Yahoo
11. Building Your Own Corpora:
BootCat Front End
Download the app from this link:
http://bootcat.dipintra.it/?section=download
Get a Search Engine Key. See instructions here:
http://bit.ly/SearchEngineKey
Check the online Tutorial:
http://bit.ly/BC_Tutorial
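To give an idea of what happens once the app is set up: as far as I understand the BootCaT workflow, you enter a handful of seed terms, the tool combines them into short random tuples, and each tuple is sent to the search engine as a query. The Python sketch below illustrates only that tuple step. The function names, the seed terms, and the default sizes are my own assumptions for illustration; this is not BootCaT's code.

```python
import random
from itertools import combinations


def make_tuples(seeds, tuple_size=3, n_tuples=10, rng_seed=0):
    """Return up to n_tuples distinct random combinations of seed terms.

    Hypothetical helper: the defaults are illustrative, not BootCaT's.
    """
    all_combos = list(combinations(sorted(set(seeds)), tuple_size))
    rng = random.Random(rng_seed)  # fixed seed so the run is repeatable
    rng.shuffle(all_combos)
    return all_combos[:n_tuples]


def to_query(terms):
    """Join terms into one query, quoting multi-word terms as phrases."""
    return " ".join(f'"{t}"' if " " in t else t for t in terms)


# Example seed terms (made up for this sketch).
seeds = ["stent", "catheter", "guidewire", "balloon angioplasty", "restenosis"]
queries = [to_query(t) for t in make_tuples(seeds)]

for q in queries:
    print(q)
```

The more specific the seed terms, the more focused the pages returned for each query are likely to be.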
14. AntConc: Exploring Your Corpus
Free software developed by Dr. Laurence Anthony, Professor in the Faculty of Science and Engineering at Waseda University, Japan. He is a former director of the Center for English Language Education (CELESE) and coordinator of the CELESE technical English program.
15. AntConc: Exploring Your Corpus
Download the app from this link:
http://www.laurenceanthony.net/software.html
Download the manual from this link:
http://bit.ly/AC_Manual
16. AntConc: Exploring Your Corpus – The
Concordance Window
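A concordance lists every occurrence of a search term together with the words around it (keyword in context, or KWIC). The Python sketch below shows the idea on a plain string. It is a simplified illustration, not AntConc's implementation, and the function name and sample text are assumptions.

```python
import re


def kwic(text, term, width=30):
    """Return one keyword-in-context line per match of `term`.

    `width` is the number of characters of context kept on each side.
    """
    text = " ".join(text.split())  # collapse line breaks and extra spaces
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines


sample = "The stent was deployed in the artery. A second stent was placed distally."
for line in kwic(sample, "stent", width=20):
    print(line)
```

Reading the keyword column top to bottom makes recurring patterns to the left and right of the term easy to spot.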
17. AntConc: Exploring Your Corpus – The
Collocates Window
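Collocates are the words that tend to appear near a search term. The Python sketch below counts the words found within a small window on either side of the term. It uses raw counts only and is a simplified illustration rather than AntConc's own method; the function name and sample text are assumptions.

```python
import re
from collections import Counter


def collocates(text, node, span=2):
    """Count words occurring within `span` tokens of each `node` match."""
    tokens = re.findall(r"\w+", text.lower())
    node = node.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts


sample = "the coronary stent was placed and the coronary stent held"
for word, freq in collocates(sample, "stent", span=1).most_common():
    print(word, freq)
```

A larger `span` catches looser associations, at the cost of more noise from function words.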
19. DocTools: Extracting Tracked Changes
ExtractData: a free Word add-in developed by Lene Fredborg
Some highlights from her website: https://wordaddins.com
Established DocTools in 2006.
20+ years working professionally with Word and programming add-ins and macros in Visual Basic for Applications (VBA)
Developed several add-ins that can function as stand-alone products
Via her website, she makes add-ins available to Word users in general
Her motto: “Time-saving tools made for you!”
20. DocTools: Extracting Tracked Changes
A Word add-in that works in Word 2007, Word 2010, Word 2013, and Word 2016 (Windows only).
Request the add-in and check the installation instructions at this link:
http://bit.ly/DocTools_Request
After installation, you will see a new «DocTools» tab in Word
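For readers curious about what extracting tracked changes involves, here is a rough Python sketch. It is not how ExtractData works (that is a VBA add-in running inside Word). It relies on the fact that a .docx file is a ZIP archive whose main text lives in word/document.xml, where tracked insertions are stored in w:ins elements and tracked deletions in w:del elements. The function names are my own, and the sketch ignores comments, formatting changes, headers, and footnotes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by .docx files.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_document_xml(docx_path):
    """Return the raw XML of the main document part of a .docx file."""
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read("word/document.xml")


def extract_tracked_changes(document_xml):
    """Return (kind, author, text) tuples in document order."""
    root = ET.fromstring(document_xml)
    changes = []
    for el in root.iter():
        if el.tag == f"{{{W}}}ins":
            kind, text_tag = "inserted", "t"
        elif el.tag == f"{{{W}}}del":
            kind, text_tag = "deleted", "delText"
        else:
            continue
        text = "".join(t.text or "" for t in el.iter(f"{{{W}}}{text_tag}"))
        if text:  # skip markers with no text, e.g. paragraph-mark changes
            author = el.get(f"{{{W}}}author", "")
            changes.append((kind, author, text))
    return changes


# Usage with a real file (the path is hypothetical):
# for change in extract_tracked_changes(read_document_xml("review.docx")):
#     print(change)
```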
22. Acrobat Reader: Searching in multiple
PDFs
Here’s how to proceed:
1) Save all the PDF files you would like to search in a single folder.
2) Open one file with Acrobat Reader.
3) Press «Shift+Ctrl+F» or choose «Advanced Search» from the Edit menu.
23. Acrobat Reader: The Advanced Search
Window
4) Select “All PDF Documents in”.
5) Navigate to the folder where you saved all your files (Step 1).
6) Type the word(s) to search for in the search box.
24. Acrobat Reader: The Advanced Search
Window
7) When this window pops up, click “Allow”.
8) After a few seconds, your search results will display in the advanced search window.
9) Click the plus sign (+) to see all results in each file.
10) Click on the result line to jump to the PDF document.
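If you ever need the same kind of folder-wide search outside Acrobat Reader, the steps above can be approximated with a short script. The Python sketch below is only an illustration under assumptions: it relies on the third-party pypdf package for text extraction (it must be installed separately), the function names are my own, and scanned PDFs without a text layer will return nothing.

```python
from pathlib import Path


def pdf_pages(path):
    """Yield (page_number, text) for each page of a PDF.

    Assumption: the third-party `pypdf` package is installed.
    """
    from pypdf import PdfReader  # imported here so the rest works without it

    for number, page in enumerate(PdfReader(str(path)).pages, start=1):
        yield number, page.extract_text() or ""


def search_folder(folder, query, pattern="*.pdf", read_pages=pdf_pages):
    """Return {file name: [(page number, matching line), ...]}.

    The match is a case-insensitive substring test on each line of text.
    """
    needle = query.lower()
    hits = {}
    for path in sorted(Path(folder).glob(pattern)):
        for number, text in read_pages(path):
            for line in text.splitlines():
                if needle in line.lower():
                    hits.setdefault(path.name, []).append((number, line.strip()))
    return hits


# Usage (the folder name is hypothetical):
# for name, results in search_folder("reference_pdfs", "stent").items():
#     print(name, results)
```

Like the Advanced Search window, the result is grouped by file, with one entry per matching line.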
26. Patricia María Ferreira Larrieux
E-mail: patricia.ferreira@language.proz.com
Website: www.pmferreira-larrieux.it
LinkedIn profile: https://www.linkedin.com/in/pmferreiralarrieux/
ProZ profile: http://www.proz.com/profile/4437
Twitter: @PFerreiraLarr
The World Wide Web is a mine of language data of unprecedented richness and ease of access.
It is also the only viable source of "disposable" corpora, built ad hoc for a specific purpose (e.g. a translation or interpreting task).
These corpora are essential resources for language professionals who routinely work with specialized languages, often in areas where neologisms and new terms are introduced at a fast pace and where standard reference corpora have to be complemented by easy-to-construct, focused, up-to-date text collections.