The document summarizes a presentation about the FALCON project, which aims to leverage linked data to enable controlled, decentralized sharing of language resources. This will allow small and medium localization companies to benefit from large-scale reuse of translation memories and term bases. The FALCON approach involves providing an open schema and SaaS platform to expose resources as linked data and enable stand-off annotation. It will also facilitate end-to-end process management and on-demand assembly of domain-specific machine translation training corpora.
TaaS Workshop 2014, Active Terminology Prompting for SEO and Website Translation, Ioannis Iakovidis, Interverbum Technology
1. Wednesday, 4 June / 09:40–10:10
Active Terminology Prompting for SEO and Website Translation
Ioannis Iakovidis, Interverbum Technology
TaaS Workshop 2014
4 June, Dublin (Ireland)
The research within the project TaaS leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013), grant agreement no 296312
3. In brief…
§ Interverbum Technology est. 1999
§ Today more than 80,000 licensed TermWeb users
§ Some clients: Bauer, Euroscript, Arancho, FujiXerox, Medtronic, Novell, SAS Institute, Scania, SDI Media, SIS (Swedish Standards Institute), Xerox, VMware, Salesforce.com, Agilent, The World Bank
§ Clients served from five offices in Europe, North America and Asia
Terminology Management for Global Marketing
5. COLLABORATE
quick and easy grid-view editing
AUTOMATE
powerful customizable workflows
INTEGRATE
cross-system communication by TermWebIntegrator
EMBELLISH
full media support
EVOLVE
platform-independent, cloud-based, uniquely versatile
TermWeb 3.11
Brings easier collaboration
to the cloud
6. FALCON in a nutshell
§ L10n is a Big Data Industry
§ Large-scale, Monetised Reuse of Translation Memories and Term Bases
§ SMT and Text Analytics now also leverage these resources
§ Large clients and LSPs curate and add value to such resources as assets
§ FALCON aims to extend these benefits to SMEs
Leverage the power of Linked Data for the Long Tail of the Localisation Industry
8. § Trinity College Dublin (IE)
LOD Mapping and Link Quality
Federated Access Control
L10n Interoperability
§ XTM International (UK)
CAT/L10n management vendor and interoperability
§ Dublin City University (IE)
SMT and text analytics
§ SKAWA Innovation (HU)
Web site translation (EasyLing), crowdsourcing
§ Interverbum Technology (SE)
Terminology Management
FALCON
Consortium
9. § Provide an Open Schema and SaaS platform for exposing language resources as linked data
§ Enable controlled, decentralised sharing of resources and stand-off value-add annotation
Term or named entity annotation
Translation process provenance and QA
§ Active Curation of resources and value add
§ End-to-end process management
§ On-demand assembly of domain-specific LT training corpora
FALCON
Approach
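The idea of stand-off annotation on this slide can be illustrated with a minimal sketch. It assumes that a shared segment is identified by a URI and that a term or named entity annotation points at it by character offsets, so the shared resource itself is never modified. The class name, field names and `example.org` URIs are hypothetical and are not taken from the FALCON schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandOffAnnotation:
    """A value-add annotation stored apart from the resource it describes.

    All field names are illustrative assumptions, not the FALCON schema.
    """
    target_uri: str   # URI of the annotated segment (hypothetical)
    start: int        # character offset where the annotated span begins
    end: int          # character offset just past the annotated span
    kind: str         # e.g. "term" or "named-entity"
    annotator: str    # URI of the agent that added the annotation


def annotated_text(segment_text: str, annotation: StandOffAnnotation) -> str:
    """Resolve the annotation against the segment text without changing it."""
    return segment_text[annotation.start:annotation.end]


segment = "Install the printer driver"
annotation = StandOffAnnotation(
    target_uri="http://example.org/l3data/segment/42",
    start=12,
    end=26,
    kind="term",
    annotator="http://example.org/l3data/agent/terminologist-1",
)
print(annotated_text(segment, annotation))
```

Because the annotation only references the segment, several parties could annotate the same shared resource independently, which is one way to read "controlled, decentralised sharing".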
10. From: http://www.w3.org/TR/prov-primer/
ITS related entity subclass: document, segment, analysed-text, term, translation, translation-revision
subproperty: wasTranslatedFrom
W3C Provenance WG
http://www.w3.org/2011/prov/
Provenance-Oriented
Approach
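The provenance-oriented model on this slide can be sketched as plain subject-predicate-object triples. The sketch assumes that the listed entity types are subclasses of `prov:Entity` and that `wasTranslatedFrom` specialises `prov:wasDerivedFrom`; the slide names the subproperty but not its parent, so that parent is an assumption, as is the `example.org` namespace.

```python
PROV = "http://www.w3.org/ns/prov#"     # W3C PROV-O namespace
EX = "http://example.org/l3data#"       # hypothetical namespace for this sketch

# Entity types listed on the slide.
ENTITY_SUBCLASSES = [
    "document", "segment", "analysed-text",
    "term", "translation", "translation-revision",
]


def schema_triples():
    """Return the schema as a set of (subject, predicate, object) triples."""
    triples = {
        (EX + name, "rdfs:subClassOf", PROV + "Entity")
        for name in ENTITY_SUBCLASSES
    }
    # Assumption: wasTranslatedFrom is a subproperty of prov:wasDerivedFrom.
    triples.add(
        (EX + "wasTranslatedFrom", "rdfs:subPropertyOf", PROV + "wasDerivedFrom")
    )
    return triples


def infer_superproperties(data, schema):
    """Add the triples implied by one level of rdfs:subPropertyOf."""
    parents = {s: o for s, p, o in schema if p == "rdfs:subPropertyOf"}
    inferred = set(data)
    for s, p, o in data:
        if p in parents:
            inferred.add((s, parents[p], o))
    return inferred


data = {(EX + "seg-1-de", EX + "wasTranslatedFrom", EX + "seg-1-en")}
closure = infer_superproperties(data, schema_triples())
```

With this reading, a generic PROV consumer that knows nothing about translation would still see the German segment as derived from the English one.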
12. [Architecture diagram with three layers: Users, Systems, Linked Data]
Users: Localisation Client, Project Manager, Translators/Posteditors, Translation Reviewers, Terminologist
Systems: Multilingual Web Management (EasyLing), Translation Management (XTM Cloud), Machine Translation (Moses – DCU), Terminology Management (TermWeb), Text Analytics (NER – DCU), Client CMS; the L3Data API label appears four times
Linked Data: Language Resource LOD Store, Public Language LOD, Resource Curator; source doc, target, source segments, target segments, bi-text, ML terms, project TM, project term base, QA meta-data
14. Local termbases with SEO-scores are linked via FALCON and translated in TermWeb
The translated term with the best SEO-score will be returned to the local termbase
[Diagram: six local termbases (TB), CMS/SEO systems, FALCON, TermWeb]
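The selection step described on this slide, returning the translated term with the best SEO-score, can be sketched as follows. The data model is a guess for illustration only: the slide does not say how SEO-scores are computed or represented, so the `TermCandidate` class, its fields and the sample scores are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TermCandidate:
    """One translated term with its SEO-score (hypothetical structure)."""
    text: str
    language: str
    seo_score: float


def best_translation(
    candidates: Iterable[TermCandidate], language: str
) -> Optional[TermCandidate]:
    """Return the candidate in `language` with the highest SEO-score.

    Returns None when no candidate exists for that language.
    """
    in_language = [c for c in candidates if c.language == language]
    if not in_language:
        return None
    return max(in_language, key=lambda c: c.seo_score)


# Made-up candidates and scores, purely for illustration.
candidates = [
    TermCandidate("Mobiltelefon", "de", 0.62),
    TermCandidate("Handy", "de", 0.91),
    TermCandidate("téléphone portable", "fr", 0.77),
]
winner = best_translation(candidates, "de")
```

In this reading, `winner` is the term that would be written back to the local termbase.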
15. § Language Resource Publishers can audit links used in building other resources, track ROI
§ Tool Vendors and Integrators expand markets with more open asset management offerings
§ SME LSPs gain resource sharing and pooling opportunities that avoid lock-in
§ LSPs and clients can use Active Curation to quickly train domain-specific SMT and text analytics components
Value Network of Benefits
16. Welcome to TermWeb
§ info@termweb.com
§ Interverbumtech.com
§ +46 13 32 98 40 (or any other office)
STOCKHOLM LINKÖPING CHICAGO BOISE SINGAPORE