Curation Technologies for Multilingual Europe
Georg Rehm
DFKI, Germany
META-FORUM 2016 –  Lisbon, Portugal – 04/05 July 2016
[Figure: information flow across Input, Processes, Software, Output]
•  Author
•  Scholar
•  TV editor
•  Researcher
•  Knowledge worker
•  Investigative journalist
•  Designer of an exhibition
•  Curator of digital information
Sectors
•  Input: tweet, newspaper article, wire copy, facebook status update, search result, email, text message, concept, text file, video, map, stockphoto, in-house database, calendar entry, spreadsheet, archive, etc.
•  Processes: analyse, select, focus, revise, read up on, write, create, research, assess, evaluate, arrange, sort, structure, summarise, shorten, translate, catch up on, combine, abstract, integrate, visualise, generate, annotate, reference, etc.
•  Software: text processor, presentation, spreadsheet, email, browser, groupware, sector-specific application, CMS, ECMS, CRM, enterprise software, graphics/layouting software, IP telephony, etc.
•  Output: newspaper article, multimedia website, tv report, exhibition catalogue, mobile application, mashup (e.g., map), text piece, concept, timeline, study, presentation, fact collection, description of an exhibit, analysis, etc.
[Figure: technology layers: language and knowledge technologies; curation technologies; sector-specific technologies; platform technologies; sector-specific solutions]
Digital Curation Technologies
•  Make curation processes in four SMEs (and sectors) more
efficient through language and knowledge technologies.
•  Technology transfer project to arrive at proofs of concept.
•  Curation services for real companies and real use cases.
•  The human expert/curator is always at the centre and in the loop.
•  Platform for digital curation technologies: innovation boost.
Curation Dashboard
Structure visualisation
Multilingual multimedia sources
Crossmedia recommendations
Multilingual summarisation
Event timelining
Semantification of content
Multilingual sentiment analysis
Semantic storytelling
Ontology-based knowledge structures
Automatic hyperlinking of document collections
Curation Processes: processing, exploration and re-aggregation of domain- and task-specific document collections.
Key Characteristics
•  Technology transfer and integration project
•  Broad set of tools and technologies
•  Focus on building proofs of concept
•  Our technologies don’t have to be perfect
•  Human expert, i.e., the curator, always in the loop
•  Important for all SME partners: domain-adaptability.
•  WPs: Semantic Analysis, Semantic Generation,
Multilingual Technologies, Integration into Curation Tech
[Figure: platform for digital curation technologies: clients using the API; broker REST API; curation service 1 and curation service 2 (each a language or knowledge technology); external service 1 and external service 2; pipelined curation workflow]
•  Curation process: e-service available through REST API.
•  Services can be combined to form pipelines or workflows.
•  Domain-adaptability: every curation process has a training API to create
and use domain-specific models.
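The idea of combining services into a pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `make_pipeline`, `annotate`, `count_tokens` and `find_capitalised` are hypothetical names, and the two "services" are local stand-ins; on the platform each step would presumably be a call to an e-service through the REST API.

```python
from functools import reduce


def make_pipeline(*services):
    """Chain services so that each one receives the previous one's output."""
    def run(document):
        return reduce(lambda doc, service: service(doc), services, document)
    return run


def annotate(document, layer, value):
    """Return a copy of the document with one more annotation layer."""
    return {**document, "annotations": {**document["annotations"], layer: value}}


# Hypothetical stand-ins for curation services (not the platform's real services).
def count_tokens(document):
    return annotate(document, "tokens", len(document["text"].split()))


def find_capitalised(document):
    words = [w.strip(".,") for w in document["text"].split()]
    return annotate(document, "capitalised", [w for w in words if w[:1].isupper()])


workflow = make_pipeline(count_tokens, find_capitalised)
result = workflow({"text": "Viking ships were scuttled near Roskilde.", "annotations": {}})
print(result["annotations"])
```

Because every step takes a document and returns an enriched document, steps can be reordered or swapped without changing the others.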
Current Results
•  Implemented the following baseline services:
–  NER – e-entityrecognition e-service
–  Geolocation – e-entityrecognition and visualisation
–  Temporal Analyser – e-entityrecognition and visualisation
–  Classification – e-classification e-service
–  Clustering – e-clustering e-service
–  Machine Translation – e-translation e-service
•  Curation Dashboard (first prototype)
•  Semantic Storytelling (work in progress)
NER, Entity Linking, Geolocation
...
In the Viking colony of Iceland, an extraordinary vernacular literature blossomed in the 12th through 14th centuries
...
...
The ships were scuttled there in the 11th century, to block a navigation channel and thus protect Roskilde, then Copenhagen from seaborne assault
...
...
Viking Age inscriptions have also been discovered on the Manx runestones on the Isle of Man.
...
Plain Text, NIF enrichment, visualisation
http://api.digitale-kuratierung.de/api/e-nlp/namedEntityRecognition?analysis=ner
http://dev.digitale-kuratierung.de/admini/pages/geolocalization.php
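A request to the NER endpoint might be prepared as follows. Only the endpoint URL and the `analysis` parameter come from the slide; the HTTP method, the plain-text body and the `Content-Type` header are assumptions, and `build_request` is a hypothetical helper. The sketch only builds the request and does not send anything.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint as shown on the slide.
BASE = "http://api.digitale-kuratierung.de/api/e-nlp/namedEntityRecognition"


def build_request(text, analysis):
    """Prepare (but do not send) a request for one analysis mode."""
    url = f"{BASE}?{urlencode({'analysis': analysis})}"
    return Request(
        url,
        data=text.encode("utf-8"),  # assumption: plain text in the request body
        headers={"Content-Type": "text/plain; charset=utf-8"},  # assumption
        method="POST",  # assumption
    )


request = build_request("The ships were scuttled there in the 11th century.", "ner")
print(request.get_method(), request.full_url)
```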
•  Currently based on OpenNLP (with NIF integration)
•  Mode 1: model-based (for domains where annotated
data is available)
•  Mode 2: dictionary-based (for domains where only a
list of names is available)
•  Entity Linking through SPARQL queries to DBpedia
•  For locations, GPS coordinates are retrieved; the
document-level average and standard deviation (over
all locations) are calculated to position documents
on a map.
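The document-level positioning described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `document_position` is a hypothetical name, the coordinates are rough illustrative values, and the use of the population standard deviation is an assumption (the slide does not say which variant is used).

```python
from statistics import mean, pstdev


def document_position(coords):
    """Average (lat, lon) pairs and report their spread for one document."""
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    return {
        "lat_mean": mean(lats),
        "lon_mean": mean(lons),
        "lat_std": pstdev(lats),  # assumption: population standard deviation
        "lon_std": pstdev(lons),
    }


# Rough, illustrative coordinates for locations mentioned in the example texts.
locations = [(55.64, 12.08), (55.68, 12.57), (54.24, -4.53)]
print(document_position(locations))
```

A plain average of longitudes breaks down for collections that straddle the 180° meridian; a real implementation would need to handle that case.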
NER Training
http://api.digitale-kuratierung.de/api/e-nlp/trainModel?analysis=dict
(in the suboptimal case that only a list of terms and their URIs in an ontology is available)
http://api.digitale-kuratierung.de/api/e-nlp/trainModel?analysis=ner
(if annotated training data is available)
Result: NER model, directly usable on new input
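The dictionary-based mode, where only a list of terms and their URIs is available, can be illustrated with a small matcher. This is a sketch under assumptions, not the OpenNLP-based service: `dictionary_ner` is a hypothetical name and the gazetteer entries are chosen for illustration.

```python
import re


def dictionary_ner(text, gazetteer):
    """Return (start, end, surface form, URI) for each gazetteer term found."""
    if not gazetteer:
        return []
    # Longest terms first, so a multi-word name wins over a shorter overlap.
    terms = sorted(gazetteer, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")
    return [
        (m.start(), m.end(), m.group(1), gazetteer[m.group(1)])
        for m in pattern.finditer(text)
    ]


# Illustrative term list with URIs (assumed for this example).
gazetteer = {
    "Roskilde": "http://dbpedia.org/resource/Roskilde",
    "Copenhagen": "http://dbpedia.org/resource/Copenhagen",
    "Isle of Man": "http://dbpedia.org/resource/Isle_of_Man",
}
text = "protect Roskilde, then Copenhagen from seaborne assault"
for start, end, surface, uri in dictionary_ner(text, gazetteer):
    print(start, end, surface, uri)
```

The character offsets returned here are the kind of information a NIF annotation would carry for each entity mention.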
Temporal Analysis
...
The ships were scuttled there in the 11th century, to block a navigation channel and thus protect Roskilde, then Copenhagen from seaborne assault
...
...
Viking Age inscriptions have also been discovered on the Manx runestones on the Isle of Man.
...
...
In the Viking colony of Iceland, an extraordinary vernacular literature blossomed in the 12th through 14th centuries
...
[Figure: timeline axis from 900 to 1600]
Plain Text, NIF enrichment, visualisation
http://api.digitale-kuratierung.de/api/e-nlp/namedEntityRecognition?analysis=temp
http://dev.digitale-kuratierung.de/admini/pages/timelining.php
•  Sort and rank documents from a
collection on chronological scale.
•  Developed a rule-based system because of
our focus on specific languages (EN, DE),
domain adaptability and normalisation
requirements.
•  Analysis of temporal expressions
in a document (or, later,
paragraphs or even sentences).
•  Compute mean value for date and
time, allowing positioning on a
timeline.
•  Future plans: adaptability through
user-specific rules.
•  Related work: SUTime,
HeidelTime, Tango, Tarsqi; many
papers at LREC 2016
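A rule-based temporal analyser of the kind described above can be sketched with two rules for century expressions. This is a toy illustration, not the project's rule set: the patterns, the function names and the normalisation of the n-th century to the span from (n-1)*100 to n*100 are all assumptions.

```python
import re
from statistics import mean

# Two illustrative rules: "12th through 14th centuries" and "11th century".
RANGE = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)\s+(?:through|to)\s+(\d{1,2})(?:st|nd|rd|th)\s+centuries\b",
    re.I,
)
SINGLE = re.compile(r"\b(\d{1,2})(?:st|nd|rd|th)\s+century\b", re.I)


def century_span(n):
    """Normalise the n-th century to a (start, end) pair of years."""
    return ((n - 1) * 100, n * 100)


def temporal_spans(text):
    spans = []
    for m in RANGE.finditer(text):
        first, last = int(m.group(1)), int(m.group(2))
        spans.append((century_span(first)[0], century_span(last)[1]))
    for m in SINGLE.finditer(text):
        spans.append(century_span(int(m.group(1))))
    return spans


def mean_year(text):
    """Mean of the span midpoints, or None if no expression was found."""
    spans = temporal_spans(text)
    if not spans:
        return None
    return mean((start + end) / 2 for start, end in spans)


docs = [
    "literature blossomed in the 12th through 14th centuries",
    "The ships were scuttled there in the 11th century",
]
print(sorted(docs, key=mean_year))
```

Documents without any recognised expression yield `None` and would have to be placed separately before sorting a collection chronologically.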
Classification
•  Mallet – Maximum Entropy Algorithm
•  Algorithm for text classification, easy integration.
•  Goal: text classification, i.e., assign a topic (class) to a
document (or parts of a document) to apply domain- or topic-
specific NLP processing techniques.
•  Future plans: improvement of classification schema by means
of new training data and additional algorithms.
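To make the maximum entropy idea concrete, here is a tiny pure-Python multinomial logistic regression over bag-of-words features. It is not Mallet and does not use Mallet's API; the function names, the training sentences, the labels and the hyperparameters are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def features(text):
    """Bag-of-words feature counts."""
    return Counter(text.lower().split())


def predict_proba(weights, x):
    """Softmax over per-class linear scores."""
    scores = {c: sum(w.get(word, 0.0) * n for word, n in x.items())
              for c, w in weights.items()}
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def train_maxent(docs, labels, epochs=50, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = {c: defaultdict(float) for c in sorted(set(labels))}
    data = [(features(d), y) for d, y in zip(docs, labels)]
    for _ in range(epochs):
        for x, y in data:
            probs = predict_proba(weights, x)
            for c in weights:
                error = (1.0 if c == y else 0.0) - probs[c]
                for word, n in x.items():
                    weights[c][word] += lr * error * n
    return weights


def classify(weights, text):
    probs = predict_proba(weights, features(text))
    return max(probs, key=probs.get)


# Invented training data with two illustrative topics.
docs = ["troops attack front offensive", "soldiers front attack",
        "theatre premiere tickets evening", "theatre performance evening"]
labels = ["war", "war", "culture", "culture"]
model = train_maxent(docs, labels)
print(classify(model, "attack on the front"))
```

The predicted label could then be used to route a document to domain- or topic-specific processing, as the goal above describes.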
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix nif:   <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .

<http://dkt.dfki.de/documents/#char=0,1257>
    a nif:RFC5147String , nif:String , nif:Context ;
    nif:beginIndex "0"^^xsd:nonNegativeInteger ;
    nif:endIndex "1257"^^xsd:nonNegativeInteger ;
    nif:documentClassificationLabel "Frühjahrsoffensive_1918"^^xsd:string ;
    nif:isString "Ceylon-Teestube B. Walther München Maximilian-Strasse 44 Gegenüber dem Königl. Hoftheater Telephon 428 München, den 26.XI.13. Von hier nach Dresden ab München 8.25 9.00 10.20 an Dresden 7.28 10.47 9.48 Sie müssen unbedingt Donnerstag hier bleiben. So können Sie doch nicht vorbeifahren. Donnerstag Abend eine interessante Uraufführung in den Kammerspielen \"unseligen Gedenkens \" Ich werde Billets dafür besorgen. […]"^^xsd:string .
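A NIF context like the one above could be serialised with a small helper. This is a sketch only: `turtle_escape` and `nif_context` are hypothetical names, the property names are copied from the example, and the default base URI is taken from it as well. Prefix declarations are omitted.

```python
def turtle_escape(value):
    """Escape backslashes, quotes and newlines for a short Turtle literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def nif_context(text, label, base="http://dkt.dfki.de/documents/"):
    """Serialise one classified document as a NIF context (prefixes omitted)."""
    end = len(text)  # offsets counted in Unicode code points
    return "\n".join([
        f"<{base}#char=0,{end}>",
        "    a nif:RFC5147String , nif:String , nif:Context ;",
        '    nif:beginIndex "0"^^xsd:nonNegativeInteger ;',
        f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger ;',
        f'    nif:documentClassificationLabel "{turtle_escape(label)}"^^xsd:string ;',
        f'    nif:isString "{turtle_escape(text)}"^^xsd:string .',
    ])


turtle = nif_context('Ich sagte "ja".', "Test")
print(turtle)
```

Escaping matters here because the document text itself may contain double quotes, as the example letter does.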
Clustering
•  WEKA (Expectation Maximisation algorithm)
•  Easy integration, availability, additional algorithms.
•  Goal: identification of distinct features of document collections.
•  Example use case: a user has to prepare a museum exhibit on
“Birds”. Knowing which documents can be grouped can be useful to
split the documents into exhibition rooms.
•  Future plans: allow users to easily recognize groups of documents in
new domains and collections; faceted search.
ARFF Input
@RELATION iris
@ATTRIBUTE sepallength  NUMERIC
@ATTRIBUTE sepalwidth   NUMERIC
@ATTRIBUTE petallength  NUMERIC
@ATTRIBUTE petalwidth   NUMERIC

@DATA
5.1,3.5,1.4,0.2
4.9,3.0,1.4,0.2
4.7,3.2,1.3,0.2
4.6,3.1,1.5,0.2
5.0,3.6,1.4,0.2
5.4,3.9,1.7,0.4
4.6,3.4,1.4,0.3
5.0,3.4,1.5,0.2
4.4,2.9,1.4,0.2
4.9,3.1,1.5,0.1

JSON Output
{
  "results": {
    "numberClusters": -1,
    "clusters": {
      "cluster1": {
        "clusterId": 1,
        "entitites": {
          "entity1": {
            "meanValue": 3.3099999999999996,
            "label": "sepalwidth"
          },
          "entity2": {
            "meanValue": 1.45,
            "label": "petallength"
          },
          "entity3": {
            "meanValue": 0.22000000000000003,
            "label": "petalwidth"
          }
        }
      }
    }
  }
}
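The `meanValue` fields in the JSON output match the per-attribute means of the ten ARFF rows, as if all rows fell into a single cluster. That is an observation from the numbers, not documented behaviour of the service. The sketch below reproduces those means with a minimal ARFF reader; `parse_arff` and `attribute_means` are hypothetical helpers, and no clustering (WEKA or otherwise) is performed.

```python
from statistics import mean

ARFF = """@RELATION iris
@ATTRIBUTE sepallength  NUMERIC
@ATTRIBUTE sepalwidth   NUMERIC
@ATTRIBUTE petallength  NUMERIC
@ATTRIBUTE petalwidth   NUMERIC

@DATA
5.1,3.5,1.4,0.2
4.9,3.0,1.4,0.2
4.7,3.2,1.3,0.2
4.6,3.1,1.5,0.2
5.0,3.6,1.4,0.2
5.4,3.9,1.7,0.4
4.6,3.4,1.4,0.3
5.0,3.4,1.5,0.2
4.4,2.9,1.4,0.2
4.9,3.1,1.5,0.1
"""


def parse_arff(text):
    """Read attribute names and numeric rows from a simple, dense ARFF file."""
    attributes, rows, in_data = [], [], False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        upper = line.upper()
        if upper.startswith("@ATTRIBUTE"):
            attributes.append(line.split()[1])
        elif upper.startswith("@DATA"):
            in_data = True
        elif in_data:
            rows.append([float(v) for v in line.split(",")])
    return attributes, rows


def attribute_means(attributes, rows):
    """Mean of each column, keyed by attribute name."""
    return {name: mean(column) for name, column in zip(attributes, zip(*rows))}


attributes, rows = parse_arff(ARFF)
means = attribute_means(attributes, rows)
print(means)
```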
Machine Translation
Workflow
Language & Translation Models trained on DGT, News, Europarl, TED
Named Entity Recognition; Entity Linking; Temporal Expressions; Metadata; Processing; Post-Edit; Retraining
Example
Source (DE): Herr Modi befindet sich auf einer fünftägigen Reise nach Japan, um die wirtschaftlichen Beziehungen mit der drittgrößten Wirtschaftsnation der Welt zu festigen.
MT output (EN): Mr Modi is located on a five-day trip to Japan to strengthen the economic ties with the third largest economy in the world.
•  Robust, adaptable and customised models of MT as e-services (Moses-based SMT)
•  Scenarios: museums, showrooms; news, media; publishers; cultural institutions, archives
•  Integration in curation workflows with other DKT services (NER, Temporal Analyser)
•  Plug in multiple knowledge sources (Linked Data)
Semantic Storytelling
•  Important objective for all partner use cases: Automatic
hyper-linking of task-specific, self-contained collections.
•  Input: coherent, self-contained document collection
•  Output: processed collection with added analysis information,
easily accessible as a hypertext, for efficient browsing
•  Semantic Storytelling – operates on the hypertext graph that
we construct on top of the original collection
•  Enables multiple different paths through the collection
•  Semantic storytelling is the identification, ranking and
recommendation of meaningful hypertext paths.
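One way to picture paths through a hypertext graph is sketched below. Everything here is an assumption made for illustration: documents are linked when they share entities, a link's weight is the number of shared entities, and paths are ranked by average link weight. The slides do not specify how the graph is built or how paths are ranked.

```python
from itertools import combinations


def build_hypertext_graph(doc_entities):
    """Link every pair of documents that share at least one entity."""
    graph = {doc: {} for doc in doc_entities}
    for a, b in combinations(sorted(doc_entities), 2):
        shared = len(doc_entities[a] & doc_entities[b])
        if shared:
            graph[a][b] = shared
            graph[b][a] = shared
    return graph


def ranked_paths(graph, start, max_hops=3):
    """List simple paths from start, highest average link weight first."""
    paths = []

    def walk(path, weight):
        hops = len(path) - 1
        if hops:
            paths.append((weight / hops, path))
        if hops == max_hops:
            return
        for nxt, w in sorted(graph[path[-1]].items()):
            if nxt not in path:
                walk(path + [nxt], weight + w)

    walk([start], 0)
    paths.sort(key=lambda item: (-item[0], item[1]))
    return paths


# Hypothetical collection: document ids mapped to the entities they mention.
collection = {"d1": {"a", "b"}, "d2": {"a", "b", "c"}, "d3": {"c"}}
graph = build_hypertext_graph(collection)
for score, path in ranked_paths(graph, "d1"):
    print(score, " -> ".join(path))
```

The top-ranked paths would be the candidates recommended to the curator, who stays in the loop and decides which ones are meaningful.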
http://dev.digitale-kuratierung.de/2ds3/index.php
<http://d-nb.info/gnd/11858071X, met, http://d-nb.info/gnd/129094722>
<http://d-nb.info/gnd/118589768, wrote, http://d-nb.info/gnd/118623230>
<http://d-nb.info/gnd/123242231, visited, http://d-nb.info/gnd/188402519>
<http://d-nb.info/gnd/118569015, said, http://d-nb.info/gnd/11947509X>
<http://d-nb.info/gnd/119173425, was, http://d-nb.info/gnd/118629867>
<http://d-nb.info/gnd/119178893, designed, http://d-nb.info/gnd/118629867>
<http://d-nb.info/gnd/118876759, love, http://d-nb.info/gnd/118629867>
<http://d-nb.info/gnd/118545892, depart, http://d-nb.info/gnd/107363569>
<http://d-nb.info/gnd/128830751, write, http://d-nb.info/gnd/118606026>
<http://d-nb.info/gnd/11858071X, protect, http://d-nb.info/gnd/39650438>
<http://d-nb.info/gnd/116713704, married, http://d-nb.info/gnd/52754181>
…
Curation Dashboard
Conclusions
•  Curation technologies are smart technologies to support
knowledge workers handling content and knowledge.
•  The multilingual Digital Single Market will create a
massive need for multilingual Curation Technologies due
to an ever-increasing need for multilingual content.
•  DKT is mostly centred around German and English.
•  We cater for a small set of curation processes.
•  To be extended in a larger follow-up project.
•  Extended set of curation processes, more complex
approaches, many more languages.
Thank you!
supported by

More Related Content

Viewers also liked

Grupo 3 a tejón ( Meles meles)
Grupo 3 a  tejón ( Meles meles)Grupo 3 a  tejón ( Meles meles)
Grupo 3 a tejón ( Meles meles)raquelgmur
 
techniques of propaganda
techniques of propagandatechniques of propaganda
techniques of propagandajennifer joe
 
Formación del caracter del niño
Formación del caracter del niñoFormación del caracter del niño
Formación del caracter del niñoliasfe
 
Tema 8.La revolución rusa y la URSS
Tema 8.La  revolución  rusa  y  la  URSSTema 8.La  revolución  rusa  y  la  URSS
Tema 8.La revolución rusa y la URSSsocialestolosa
 
Tutorial Timetoast
Tutorial TimetoastTutorial Timetoast
Tutorial TimetoastCarlos Diez
 
5 tipos de cenas para despedidas de soltera en Salamanca
5 tipos de cenas para despedidas de soltera en Salamanca5 tipos de cenas para despedidas de soltera en Salamanca
5 tipos de cenas para despedidas de soltera en SalamancaDespedidasdeSolteraenSalamanca
 
Sistemas Auxiliares Motor de Combustión Interna
Sistemas Auxiliares Motor de Combustión InternaSistemas Auxiliares Motor de Combustión Interna
Sistemas Auxiliares Motor de Combustión InternaMateoLeonidez
 
Oil & Gas Magazine Abril de 2015
Oil & Gas Magazine Abril de 2015Oil & Gas Magazine Abril de 2015
Oil & Gas Magazine Abril de 2015Oil & Gas Magazine
 

Viewers also liked (16)

Pka praesentation 14_03_2011
Pka praesentation 14_03_2011Pka praesentation 14_03_2011
Pka praesentation 14_03_2011
 
全新的Qt5
全新的Qt5全新的Qt5
全新的Qt5
 
Grupo 3 a tejón ( Meles meles)
Grupo 3 a  tejón ( Meles meles)Grupo 3 a  tejón ( Meles meles)
Grupo 3 a tejón ( Meles meles)
 
Estandades en el aula
Estandades en el aulaEstandades en el aula
Estandades en el aula
 
Els patricis ppt
Els patricis pptEls patricis ppt
Els patricis ppt
 
techniques of propaganda
techniques of propagandatechniques of propaganda
techniques of propaganda
 
Curso Spring Roo Spring Data Jpa Maven
Curso Spring Roo Spring Data Jpa MavenCurso Spring Roo Spring Data Jpa Maven
Curso Spring Roo Spring Data Jpa Maven
 
Formación del caracter del niño
Formación del caracter del niñoFormación del caracter del niño
Formación del caracter del niño
 
Tema 8.La revolución rusa y la URSS
Tema 8.La  revolución  rusa  y  la  URSSTema 8.La  revolución  rusa  y  la  URSS
Tema 8.La revolución rusa y la URSS
 
Hipérbola
Hipérbola Hipérbola
Hipérbola
 
EL PAPIRO
EL PAPIROEL PAPIRO
EL PAPIRO
 
Tutorial Timetoast
Tutorial TimetoastTutorial Timetoast
Tutorial Timetoast
 
5 tipos de cenas para despedidas de soltera en Salamanca
5 tipos de cenas para despedidas de soltera en Salamanca5 tipos de cenas para despedidas de soltera en Salamanca
5 tipos de cenas para despedidas de soltera en Salamanca
 
Sistemas Auxiliares Motor de Combustión Interna
Sistemas Auxiliares Motor de Combustión InternaSistemas Auxiliares Motor de Combustión Interna
Sistemas Auxiliares Motor de Combustión Interna
 
Cirquit mixt
Cirquit mixtCirquit mixt
Cirquit mixt
 
Oil & Gas Magazine Abril de 2015
Oil & Gas Magazine Abril de 2015Oil & Gas Magazine Abril de 2015
Oil & Gas Magazine Abril de 2015
 

Similar to Curation Technologies for Multilingual Europe

Web Annotations – A Game Changer for Language Technology?
Web Annotations – A Game Changer for Language Technology?Web Annotations – A Game Changer for Language Technology?
Web Annotations – A Game Changer for Language Technology?Georg Rehm
 
ResearchSpace- Example of a VRE Based on CIDOC CRM
ResearchSpace- Example of a VRE Based on CIDOC CRMResearchSpace- Example of a VRE Based on CIDOC CRM
ResearchSpace- Example of a VRE Based on CIDOC CRMVladimir Alexiev, PhD, PMP
 
Lynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The Services
Lynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The ServicesLynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The Services
Lynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The ServicesLynx Project
 
Impact Centre of Competence presentation at CERL 2014 by Tomasz Parkola (PSNC)
Impact Centre of Competence presentation at CERL 2014 by Tomasz Parkola (PSNC)Impact Centre of Competence presentation at CERL 2014 by Tomasz Parkola (PSNC)
Impact Centre of Competence presentation at CERL 2014 by Tomasz Parkola (PSNC)IMPACT Centre of Competence
 
Designing a multilingual knowledge graph - DCMI2018
Designing a multilingual knowledge graph - DCMI2018Designing a multilingual knowledge graph - DCMI2018
Designing a multilingual knowledge graph - DCMI2018Antoine Isaac
 
LoCloud - Local content in a Europeana cloud
LoCloud - Local content in a Europeana cloudLoCloud - Local content in a Europeana cloud
LoCloud - Local content in a Europeana cloudEuropeana
 
General ea en short
General ea en shortGeneral ea en short
General ea en shortMMI Group
 
publishing production
publishing productionpublishing production
publishing productionEssam Obaid
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenVladimir Alexiev, PhD, PMP
 
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...The Research Council of Norway, IKTPLUSS
 
eScriptorium: An Open Source Platform for Historical Document Analysis
eScriptorium: An Open Source Platform for Historical Document AnalysiseScriptorium: An Open Source Platform for Historical Document Analysis
eScriptorium: An Open Source Platform for Historical Document AnalysisEquipex Biblissima
 
Lynx project presentation at ENDORSE 2021 Conference
Lynx project presentation at ENDORSE 2021 ConferenceLynx project presentation at ENDORSE 2021 Conference
Lynx project presentation at ENDORSE 2021 ConferenceLynx Project
 
Local content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providersLocal content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providerslocloud
 
IIIF: International Image Interoperability Framework @ DLF2012
IIIF: International Image Interoperability Framework @ DLF2012IIIF: International Image Interoperability Framework @ DLF2012
IIIF: International Image Interoperability Framework @ DLF2012Tom-Cramer
 
Smart Content - FREME Project - Presentation Frankfurt Book Fair
Smart Content - FREME Project - Presentation Frankfurt Book FairSmart Content - FREME Project - Presentation Frankfurt Book Fair
Smart Content - FREME Project - Presentation Frankfurt Book FairKevin Koidl
 
EDF2013: Selected talk by David Lewis: Linked Data Reuse in the Language Serv...
EDF2013: Selected talk by David Lewis: Linked Data Reuse in the Language Serv...EDF2013: Selected talk by David Lewis: Linked Data Reuse in the Language Serv...
EDF2013: Selected talk by David Lewis: Linked Data Reuse in the Language Serv...European Data Forum
 

Similar to Curation Technologies for Multilingual Europe (20)

Web Annotations – A Game Changer for Language Technology?
Web Annotations – A Game Changer for Language Technology?Web Annotations – A Game Changer for Language Technology?
Web Annotations – A Game Changer for Language Technology?
 
ResearchSpace- Example of a VRE Based on CIDOC CRM
ResearchSpace- Example of a VRE Based on CIDOC CRMResearchSpace- Example of a VRE Based on CIDOC CRM
ResearchSpace- Example of a VRE Based on CIDOC CRM
 
Lynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The Services
Lynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The ServicesLynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The Services
Lynx Webinar #4: Lynx Services Platform (LySP) - Part 2 - The Services
 
Impact Centre of Competence presentation at CERL 2014 by Tomasz Parkola (PSNC)
Impact Centre of Competence presentation at CERL 2014 by Tomasz Parkola (PSNC)Impact Centre of Competence presentation at CERL 2014 by Tomasz Parkola (PSNC)
Impact Centre of Competence presentation at CERL 2014 by Tomasz Parkola (PSNC)
 
New Goals of PARES: Spanish Archives Web Portal
New Goals of PARES: Spanish Archives Web PortalNew Goals of PARES: Spanish Archives Web Portal
New Goals of PARES: Spanish Archives Web Portal
 
Session5 03.george rehm
Session5 03.george rehmSession5 03.george rehm
Session5 03.george rehm
 
Designing a multilingual knowledge graph - DCMI2018
Designing a multilingual knowledge graph - DCMI2018Designing a multilingual knowledge graph - DCMI2018
Designing a multilingual knowledge graph - DCMI2018
 
LoCloud - Local content in a Europeana cloud
LoCloud - Local content in a Europeana cloudLoCloud - Local content in a Europeana cloud
LoCloud - Local content in a Europeana cloud
 
General ea en short
General ea en shortGeneral ea en short
General ea en short
 
publishing production
publishing productionpublishing production
publishing production
 
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, SwedenSem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
Sem tech in CH, Linked Data Meetup, 2014-08-21, Malmo, Sweden
 
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
SESAM4 - A guide to semantics in the Linked Open Data cloud, Robert HP Engels...
 
eScriptorium: An Open Source Platform for Historical Document Analysis
eScriptorium: An Open Source Platform for Historical Document AnalysiseScriptorium: An Open Source Platform for Historical Document Analysis
eScriptorium: An Open Source Platform for Historical Document Analysis
 
Lynx project presentation at ENDORSE 2021 Conference
Lynx project presentation at ENDORSE 2021 ConferenceLynx project presentation at ENDORSE 2021 Conference
Lynx project presentation at ENDORSE 2021 Conference
 
Local content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providersLocal content in a Europeana cloud for small & medium content providers
Local content in a Europeana cloud for small & medium content providers
 
Ppt congreso bbva iatext 2018 final
Ppt congreso bbva iatext 2018 finalPpt congreso bbva iatext 2018 final
Ppt congreso bbva iatext 2018 final
 
IIIF: International Image Interoperability Framework @ DLF2012
IIIF: International Image Interoperability Framework @ DLF2012IIIF: International Image Interoperability Framework @ DLF2012
IIIF: International Image Interoperability Framework @ DLF2012
 
Smart Content - FREME Project - Presentation Frankfurt Book Fair
Smart Content - FREME Project - Presentation Frankfurt Book FairSmart Content - FREME Project - Presentation Frankfurt Book Fair
Smart Content - FREME Project - Presentation Frankfurt Book Fair
 
EDF2013: Selected talk by David Lewis: Linked Data Reuse in the Language Serv...
EDF2013: Selected talk by David Lewis: Linked Data Reuse in the Language Serv...EDF2013: Selected talk by David Lewis: Linked Data Reuse in the Language Serv...
EDF2013: Selected talk by David Lewis: Linked Data Reuse in the Language Serv...
 
Wroclaw university library - Grazyna Piotrowicz
Wroclaw university library - Grazyna PiotrowiczWroclaw university library - Grazyna Piotrowicz
Wroclaw university library - Grazyna Piotrowicz
 

More from Georg Rehm

QURATOR: A Flexible AI Platform for the Adaptive Analysis and Creative Genera...
QURATOR: A Flexible AI Platform for the Adaptive Analysis and Creative Genera...QURATOR: A Flexible AI Platform for the Adaptive Analysis and Creative Genera...
QURATOR: A Flexible AI Platform for the Adaptive Analysis and Creative Genera...Georg Rehm
 
Observations on Annotations – From Computational Linguistics and the World Wi...
Observations on Annotations – From Computational Linguistics and the World Wi...Observations on Annotations – From Computational Linguistics and the World Wi...
Observations on Annotations – From Computational Linguistics and the World Wi...Georg Rehm
 
The Preparation, Impact and Future of the META-NET White Paper Series “Europe...
The Preparation, Impact and Future of the META-NET White Paper Series “Europe...The Preparation, Impact and Future of the META-NET White Paper Series “Europe...
The Preparation, Impact and Future of the META-NET White Paper Series “Europe...Georg Rehm
 
AI and Conference Interpretation – From Smart Assistants for the Human Interp...
AI and Conference Interpretation – From Smart Assistants for the Human Interp...AI and Conference Interpretation – From Smart Assistants for the Human Interp...
AI and Conference Interpretation – From Smart Assistants for the Human Interp...Georg Rehm
 
Künstliche Intelligenz beim Dolmetschen und Übersetzen
Künstliche Intelligenz beim Dolmetschen und ÜbersetzenKünstliche Intelligenz beim Dolmetschen und Übersetzen
Künstliche Intelligenz beim Dolmetschen und ÜbersetzenGeorg Rehm
 
Herausforderungen und Lösungen für die europäische Sprachtechnologie- Forschu...
Herausforderungen und Lösungen für die europäische Sprachtechnologie- Forschu...Herausforderungen und Lösungen für die europäische Sprachtechnologie- Forschu...
Herausforderungen und Lösungen für die europäische Sprachtechnologie- Forschu...Georg Rehm
 
European Language Technologies – Past, Present and Future
European Language Technologies – Past, Present and FutureEuropean Language Technologies – Past, Present and Future
European Language Technologies – Past, Present and FutureGeorg Rehm
 
Towards a Human Language Project for Multilingual Europe: AI and Interpretation
Towards a Human Language Project for Multilingual Europe: AI and InterpretationTowards a Human Language Project for Multilingual Europe: AI and Interpretation
Towards a Human Language Project for Multilingual Europe: AI and InterpretationGeorg Rehm
 
KI, Sprachtechnologie und Digital Humanities: Ein (unvollständiger) Überblick
KI, Sprachtechnologie und Digital Humanities: Ein (unvollständiger) ÜberblickKI, Sprachtechnologie und Digital Humanities: Ein (unvollständiger) Überblick
KI, Sprachtechnologie und Digital Humanities: Ein (unvollständiger) ÜberblickGeorg Rehm
 
Language Technologies for Multilingual Europe - Towards a Human Language Proj...
Language Technologies for Multilingual Europe - Towards a Human Language Proj...Language Technologies for Multilingual Europe - Towards a Human Language Proj...
Language Technologies for Multilingual Europe - Towards a Human Language Proj...Georg Rehm
 
AI for Translation Technologies and Multilingual Europe
AI for Translation Technologies and Multilingual EuropeAI for Translation Technologies and Multilingual Europe
AI for Translation Technologies and Multilingual EuropeGeorg Rehm
 
Kuratieren im Zeitalter der KI
Kuratieren im Zeitalter der KIKuratieren im Zeitalter der KI
Kuratieren im Zeitalter der KIGeorg Rehm
 
Artificial Intelligence for the Film Industry
Artificial Intelligence for the Film IndustryArtificial Intelligence for the Film Industry
Artificial Intelligence for the Film IndustryGeorg Rehm
 
KI für die Kundenkommunikation
KI für die KundenkommunikationKI für die Kundenkommunikation
KI für die KundenkommunikationGeorg Rehm
 
Transformieren, Manipulieren, Kuratieren: Technologien für die Wissensarbeit ...
Transformieren, Manipulieren, Kuratieren: Technologien für die Wissensarbeit ...Transformieren, Manipulieren, Kuratieren: Technologien für die Wissensarbeit ...
Transformieren, Manipulieren, Kuratieren: Technologien für die Wissensarbeit ...Georg Rehm
 
Digitale Kuratierungstechnologien: Anwendungsfälle in Digitalen Bibliotheken
Digitale Kuratierungstechnologien: Anwendungsfälle in Digitalen BibliothekenDigitale Kuratierungstechnologien: Anwendungsfälle in Digitalen Bibliotheken
Digitale Kuratierungstechnologien: Anwendungsfälle in Digitalen BibliothekenGeorg Rehm
 
EPUB, quo vadis? Publishing im W3C
EPUB, quo vadis? Publishing im W3CEPUB, quo vadis? Publishing im W3C
EPUB, quo vadis? Publishing im W3CGeorg Rehm
 
Human Language Technologies in a Multilingual Europe
Human Language Technologies in a Multilingual EuropeHuman Language Technologies in a Multilingual Europe
Human Language Technologies in a Multilingual EuropeGeorg Rehm
 
Language Technologies for Big Data – A Strategic Agenda for the Multilingual ...
Language Technologies for Big Data – A Strategic Agenda for the Multilingual ...Language Technologies for Big Data – A Strategic Agenda for the Multilingual ...
Language Technologies for Big Data – A Strategic Agenda for the Multilingual ...Georg Rehm
 
Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda...
Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda...Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda...
Multilingual Europe in late 2016 – A Strategic Research and Innovation Agenda...Georg Rehm
 

Curation Technologies for Multilingual Europe

  • 1. Curation Technologies for Multilingual Europe. Georg Rehm, DFKI, Germany. META-FORUM 2016, Lisbon, Portugal, 04/05 July 2016
  • 2. [Figure: pieces of information passing through Input, Processes, Software and Output]
      • Author
      • Scholar
      • TV editor
      • Researcher
      • Knowledge worker
      • Investigative journalist
      • Designer of an exhibition
      • Curator of digital information
  • 3-5. Sectors
      • Input: tweet, newspaper article, wire copy, facebook status update, search result, email, text message, concept, text file, video, map, stockphoto, in-house database, calendar entry, spreadsheet, archive, etc.
      • Processes: analyse, select, focus, revise, read up on, write, create, research, assess, evaluate, arrange, sort, structure, summarise, shorten, translate, catch up on, combine, abstract, integrate, visualise, generate, annotate, reference, etc.
      • Software: text processor, presentation, spreadsheet, email, browser, groupware, sector-specific application, CMS, ECMS, CRM, enterprise software, graphics/layouting software, IP telephony, etc.
      • Output: newspaper article, multimedia website, tv report, exhibition catalogue, mobile application, mashup (e.g., map), text piece, concept, timeline, study, presentation, fact collection, description of an exhibit, analysis, etc.
      [Figure: pieces of information passing through Input, Processes, Software and Output]
  • 6. Digital Curation Technologies
      [Figure: layers labelled language and knowledge technologies; curation technologies; sector-specific technologies; platform technologies; sector-specific solutions]
      • Make curation processes in four SMEs (and sectors) more efficient through language and knowledge technologies.
      • Technology transfer project to arrive at proofs of concept.
      • Curation services for real companies and real use cases.
      • The human expert/curator is always in the centre and in the loop.
      • Platform for digital curation technologies: innovation boost.
  • 7. Curation Processes: processing, exploration and re-aggregation of domain- and task-specific document collections.
      [Figure: Curation Dashboard, with the labels below]
      • Structure visualisation
      • Multilingual multimedia sources
      • Crossmedia recommendations
      • Multilingual summarisation
      • Event timelining
      • Semantification of content
      • Multilingual sentiment analysis
      • Semantic storytelling
      • Ontology-based knowledge structures
      • Automatic hyperlinking of document collections
  • 8. Key Characteristics
      • Technology transfer and integration project
      • Broad set of tools and technologies
      • Focus on building proofs of concept
      • Our technologies don't have to be perfect
      • Human expert, i.e., the curator, always in the loop
      • Important for all SME partners: domain-adaptability
      • WPs: Semantic Analysis, Semantic Generation, Multilingual Technologies, Integration into Curation Tech
  • 9. Platform for digital curation technologies
      [Figure: architecture with the labels broker; REST API; curation service 1 and curation service 2 (each a language or knowledge technology); external service 1 and external service 2; several clients using the API; pipelined curation workflow]
      • Curation process: e-service available through REST API.
      • Services can be combined to form pipelines or workflows.
      • Domain-adaptability: every curation process has a training API to create and use domain-specific models.
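The idea of combining services into a pipelined workflow can be sketched as follows. This is a minimal illustration only: the two "services" are local stand-in functions with made-up names (`toy_entity_service`, `toy_temporal_service`), not the DKT e-services, and no REST call is made. Each step takes a document and returns it with one more annotation layer.

```python
import re
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Service = Callable[[Document], Document]


def toy_entity_service(doc: Document) -> Document:
    # Hypothetical stand-in for an entity recognition service:
    # looks up a tiny fixed list of place names.
    known = ["Roskilde", "Copenhagen", "Iceland"]
    found = [name for name in known if name in doc["text"]]
    return {**doc, "entities": found}


def toy_temporal_service(doc: Document) -> Document:
    # Hypothetical stand-in for a temporal analyser:
    # only recognises expressions such as "11th century".
    found = re.findall(r"\d{1,2}(?:st|nd|rd|th) century", doc["text"])
    return {**doc, "temporal": found}


def run_pipeline(doc: Document, services: List[Service]) -> Document:
    # Feed the output of each service into the next one.
    for service in services:
        doc = service(doc)
    return doc


result = run_pipeline(
    {"text": "The ships were scuttled there in the 11th century to protect Roskilde."},
    [toy_entity_service, toy_temporal_service],
)
print(result["entities"], result["temporal"])
```

In a real deployment each function would presumably wrap an HTTP request to the corresponding service; the composition logic would stay the same.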
  • 10. Current Results
      • Implemented the following baseline services:
          - NER: e-entityrecognition e-service
          - Geolocation: e-entityrecognition and visualisation
          - Temporal Analyser: e-entityrecognition and visualisation
          - Classification: e-classification e-service
          - Clustering: e-clustering e-service
          - Machine Translation: e-translation e-service
      • Curation Dashboard (first prototype)
      • Semantic Storytelling (work in progress)
  • 11. NER, Entity Linking, Geolocation
      Example input texts:
      ... In the Viking colony of Iceland, an extraordinary vernacular literature blossomed in the 12th through 14th centuries ...
      ... The ships were scuttled there in the 11th century, to block a navigation channel and thus protect Roskilde, then Copenhagen from seaborne assault ...
      ... Viking Age inscriptions have also been discovered on the Manx runestones on the Isle of Man. ...
      [Figure: three stages labelled Plain Text, NIF enrichment, visualisation]
      http://api.digitale-kuratierung.de/api/e-nlp/namedEntityRecognition?analysis=ner
      http://dev.digitale-kuratierung.de/admini/pages/geolocalization.php
      • Currently based on OpenNLP (with NIF integration)
      • Mode 1: model-based (for domains where annotated data is available)
      • Mode 2: dictionary-based (for domains where only a list of names is available)
      • Entity Linking through SPARQL queries to DBpedia
      • For locations, GPS coordinates are retrieved; the document-level average and standard deviation (over all locations) are calculated to visualise the positioning of documents on a map.
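The document-level average and standard deviation over all locations can be sketched as below. The coordinates are rough illustrative values, not data retrieved from DBpedia, and the use of the population standard deviation (`pstdev`) is an assumption; the slides do not say which variant is used.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]  # (latitude, longitude) in degrees


def document_position(coords: List[Coordinate]) -> Dict[str, Coordinate]:
    # Average and spread are computed per axis, which is a simple
    # approximation that ignores wrap-around at the 180th meridian.
    lats = [lat for lat, _ in coords]
    lons = [lon for _, lon in coords]
    return {
        "mean": (mean(lats), mean(lons)),
        "std": (pstdev(lats), pstdev(lons)),
    }


# Approximate, illustrative coordinates for the places in the example texts.
locations = {
    "Roskilde": (55.64, 12.08),
    "Copenhagen": (55.68, 12.57),
    "Isle of Man": (54.24, -4.55),
}
print(document_position(list(locations.values())))
```

A small standard deviation suggests that the document is about one region; a large one suggests that the mean position on the map should be read with care.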
  • 12. NER Training
      • http://api.digitale-kuratierung.de/api/e-nlp/trainModel?analysis=dict (in the suboptimal case that only a list of terms and their URIs in an ontology is available)
      • http://api.digitale-kuratierung.de/api/e-nlp/trainModel?analysis=ner (if annotated training data is available)
      [Figure: training yields an NER model that is directly usable on new input]
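The dictionary-based case, where only a list of terms and their URIs is available, can be illustrated with a simple lookup matcher. This is a sketch under assumptions, not the OpenNLP-based DKT implementation: the function name `annotate` and the two dictionary entries are chosen for illustration.

```python
import re
from typing import Dict, List, Tuple

Annotation = Tuple[int, int, str, str]  # (begin, end, surface form, URI)


def annotate(text: str, dictionary: Dict[str, str]) -> List[Annotation]:
    # Try longer terms first so that multi-word names win over their parts.
    terms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b")
    return [
        (m.start(), m.end(), m.group(), dictionary[m.group()])
        for m in pattern.finditer(text)
    ]


# Illustrative term list with URIs.
gazetteer = {
    "Isle of Man": "http://dbpedia.org/resource/Isle_of_Man",
    "Roskilde": "http://dbpedia.org/resource/Roskilde",
}
sentence = "Viking Age inscriptions have also been discovered on the Isle of Man."
for begin, end, surface, uri in annotate(sentence, gazetteer):
    print(begin, end, surface, uri)
```

The character offsets returned here are the kind of information a NIF annotation would carry as begin and end indices.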
  • 13. Temporal Analysis
      Example input texts:
      ... The ships were scuttled there in the 11th century, to block a navigation channel and thus protect Roskilde, then Copenhagen from seaborne assault ...
      ... Viking Age inscriptions have also been discovered on the Manx runestones on the Isle of Man. ...
      ... In the Viking colony of Iceland, an extraordinary vernacular literature blossomed in the 12th through 14th centuries ...
      [Figure: three stages labelled Plain Text, NIF enrichment, visualisation; timeline axis from 900 to 1600]
      http://api.digitale-kuratierung.de/api/e-nlp/namedEntityRecognition?analysis=temp
      http://dev.digitale-kuratierung.de/admini/pages/timelining.php
      • Sort and rank documents from a collection on a chronological scale.
      • Developed a rule-based system due to our focus in terms of languages (EN, DE), domain adaptability and normalisation requirements.
      • Analysis of temporal expressions in a document (or, later, paragraphs or even sentences).
      • Compute the mean value for date and time, allowing positioning on a timeline.
      • Future plans: adaptability through user-specific rules.
      • Related work: SUTime, HeidelTime, Tango, Tarsqi; many papers at LREC 2016
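Positioning a document on a timeline by the mean of its temporal expressions can be sketched as follows. The two rules used here (a four-digit year, and "Nth century" mapped to the middle of that century) are simplifying assumptions made for this example; the actual rule set of the DKT temporal analyser is not given in the slides.

```python
import re
from statistics import mean
from typing import Dict, List, Optional

CENTURY = re.compile(r"\b(\d{1,2})(?:st|nd|rd|th) century\b")
YEAR = re.compile(r"\b(1\d{3}|20\d{2})\b")


def temporal_values(text: str) -> List[int]:
    # Normalise each expression to a single year value.
    values = [(int(c) - 1) * 100 + 50 for c in CENTURY.findall(text)]
    values += [int(y) for y in YEAR.findall(text)]
    return values


def timeline_position(text: str) -> Optional[float]:
    # Mean of all normalised values, or None if nothing was found.
    values = temporal_values(text)
    return mean(values) if values else None


def sort_chronologically(docs: Dict[str, str]) -> List[str]:
    # Documents without any temporal expression are left out.
    placed = [(timeline_position(text), name) for name, text in docs.items()]
    return [name for pos, name in sorted(p for p in placed if p[0] is not None)]


docs = {
    "ships": "The ships were scuttled there in the 11th century.",
    "letter": "Written in Munich in 1913.",
    "runes": "Inscriptions have been discovered on the Isle of Man.",
}
print(sort_chronologically(docs))
```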
  • 14. Classification
      • Mallet: Maximum Entropy algorithm
      • Algorithm for text classification, easy integration.
      • Goal: text classification, i.e., assign a topic (class) to a document (or parts of a document) in order to apply domain- or topic-specific NLP processing techniques.
      • Future plans: improvement of the classification schema by means of new training data and additional algorithms.
      Example output (NIF, Turtle syntax):
      @prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
      @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
      @prefix nif:  <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      <http://dkt.dfki.de/documents/#char=0,1257>
          a nif:RFC5147String , nif:String , nif:Context ;
          nif:beginIndex "0"^^xsd:nonNegativeInteger ;
          nif:endIndex "1257"^^xsd:nonNegativeInteger ;
          nif:documentClassificationLabel "Frühjahrsoffensive_1918"^^xsd:string ;
          nif:isString "Ceylon-Teestube B. Walther München Maximilian-Strasse 44 Gegenüber dem Königl. Hoftheater Telephon 428 München, den 26.XI.13. Von hier nach Dresden ab München 8.25 9.00 10.20 an Dresden 7.28 10.47 9.48 Sie müssen unbedingt Donnerstag hier bleiben. So können Sie doch nicht vorbeifahren. Donnerstag Abend eine interessante Uraufführung in den Kammerspielen \"unseligen Gedenkens \" Ich werde Billets dafür besorgen. […]"^^xsd:string .
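The shape of the classification output shown above can be reproduced with a few lines of string building. This is only a sketch of the serialisation step, with assumed helper names (`turtle_string`, `nif_context`); it does not perform any classification and is not the DKT code. The property names are the ones that appear in the example.

```python
def turtle_string(text: str) -> str:
    # Escape characters that would otherwise end or break a Turtle literal.
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"^^xsd:string'


def nif_context(text: str, label: str,
                base: str = "http://dkt.dfki.de/documents/") -> str:
    # The context spans the whole document, from offset 0 to its length.
    end = len(text)
    lines = [
        f"<{base}#char=0,{end}>",
        "    a nif:RFC5147String , nif:String , nif:Context ;",
        '    nif:beginIndex "0"^^xsd:nonNegativeInteger ;',
        f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger ;',
        f"    nif:documentClassificationLabel {turtle_string(label)} ;",
        f"    nif:isString {turtle_string(text)} .",
    ]
    return "\n".join(lines)


sample = 'Er sagte "ja".'
print(nif_context(sample, "Frühjahrsoffensive_1918"))
```

The `@prefix` declarations from the example would have to precede this fragment for it to parse as Turtle.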
  • 15. Clustering
      • WEKA (Expectation Maximisation algorithm)
      • Easy integration, availability, additional algorithms.
      • Goal: identification of distinct features of document collections.
      • Example use case: a user has to prepare a museum exhibit on "Birds". Knowing which documents can be grouped can be useful to split the documents into exhibition rooms.
      • Future plans: allow users to easily recognize groups of documents in new domains and collections; faceted search.
      ARFF Input:
      @RELATION iris
      @ATTRIBUTE sepallength NUMERIC
      @ATTRIBUTE sepalwidth NUMERIC
      @ATTRIBUTE petallength NUMERIC
      @ATTRIBUTE petalwidth NUMERIC
      @DATA
      5.1,3.5,1.4,0.2
      4.9,3.0,1.4,0.2
      4.7,3.2,1.3,0.2
      4.6,3.1,1.5,0.2
      5.0,3.6,1.4,0.2
      5.4,3.9,1.7,0.4
      4.6,3.4,1.4,0.3
      5.0,3.4,1.5,0.2
      4.4,2.9,1.4,0.2
      4.9,3.1,1.5,0.1
      JSON Output:
      { "results": {
          "numberClusters": -1,
          "clusters": {
            "cluster1": {
              "clusterId": 1,
              "entitites": {
                "entity1": { "meanValue": 3.3099999999999996, "label": "sepalwidth" },
                "entity2": { "meanValue": 1.45, "label": "petallength" },
                "entity3": { "meanValue": 0.22000000000000003, "label": "petalwidth" }
              }
            }
          }
      } }
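The `meanValue` entries in the JSON output correspond to the per-attribute means of the ten ARFF rows, which all end up in one cluster. The sketch below checks this relationship with a minimal ARFF reader; it does not run Expectation Maximisation, and the reader handles only the numeric subset of ARFF used in the example.

```python
from statistics import mean
from typing import Dict, List, Tuple

ARFF = """@RELATION iris
@ATTRIBUTE sepallength NUMERIC
@ATTRIBUTE sepalwidth NUMERIC
@ATTRIBUTE petallength NUMERIC
@ATTRIBUTE petalwidth NUMERIC
@DATA
5.1,3.5,1.4,0.2
4.9,3.0,1.4,0.2
4.7,3.2,1.3,0.2
4.6,3.1,1.5,0.2
5.0,3.6,1.4,0.2
5.4,3.9,1.7,0.4
4.6,3.4,1.4,0.3
5.0,3.4,1.5,0.2
4.4,2.9,1.4,0.2
4.9,3.1,1.5,0.1
"""


def parse_arff(text: str) -> Tuple[List[str], List[List[float]]]:
    # Collect attribute names from the header and numeric rows after @DATA.
    names: List[str] = []
    rows: List[List[float]] = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue
        if line.upper().startswith("@ATTRIBUTE"):
            names.append(line.split()[1])
        elif line.upper().startswith("@DATA"):
            in_data = True
        elif in_data:
            rows.append([float(v) for v in line.split(",")])
    return names, rows


def attribute_means(names: List[str], rows: List[List[float]]) -> Dict[str, float]:
    # Transpose the rows into columns and average each column.
    return {name: mean(column) for name, column in zip(names, zip(*rows))}


names, rows = parse_arff(ARFF)
print(attribute_means(names, rows))
```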
  • 16. Machine Translation
      [Figure: workflow with the labels Named Entity Recognition, Entity Linking, Temporal Expressions, Metadata Processing, Post-Edit, Retraining]
      Language & Translation Models trained on DGT, News, Europarl, TED
      Example:
      Source (DE): Herr Modi befindet sich auf einer fünftägigen Reise nach Japan, um die wirtschaftlichen Beziehungen mit der drittgrößten Wirtschaftsnation der Welt zu festigen.
      MT output (EN): Mr Modi is located on a five-day trip to Japan to strengthen the economic ties with the third largest economy in the world.
      • Robust, adaptable and customised models of MT as e-services (Moses-based SMT)
      • Scenarios: museums, showrooms; news, media; publishers; cultural institutions, archives
      • Integration in curation workflows with other DKT services (NER, Temporal Analyser)
      • Plug-in multiple knowledge sources (Linked Data)
  • 17. Semantic Storytelling
      • Important objective for all partner use cases: automatic hyperlinking of task-specific, self-contained collections.
      • Input: coherent, self-contained document collection
      • Output: processed collection with added analysis information, easily accessible as a hypertext, for efficient browsing
      • Semantic Storytelling operates on the hypertext graph that we construct on top of the original collection
      • Enables multiple different paths through the collection
      • Semantic storytelling is the identification, ranking and recommendation of meaningful hypertext paths.
  • 18. Semantic Storytelling: extracted relations
      http://dev.digitale-kuratierung.de/2ds3/index.php
      <http://d-nb.info/gnd/11858071X, met, http://d-nb.info/gnd/129094722>
      <http://d-nb.info/gnd/118589768, wrote, http://d-nb.info/gnd/118623230>
      <http://d-nb.info/gnd/123242231, visited, http://d-nb.info/gnd/188402519>
      <http://d-nb.info/gnd/118569015, said, http://d-nb.info/gnd/11947509X>
      <http://d-nb.info/gnd/119173425, was, http://d-nb.info/gnd/118629867>
      <http://d-nb.info/gnd/119178893, designed, http://d-nb.info/gnd/118629867>
      <http://d-nb.info/gnd/118876759, love, http://d-nb.info/gnd/118629867>
      <http://d-nb.info/gnd/118545892, depart, http://d-nb.info/gnd/107363569>
      <http://d-nb.info/gnd/128830751, write, http://d-nb.info/gnd/118606026>
      <http://d-nb.info/gnd/11858071X, protect, http://d-nb.info/gnd/39650438>
      <http://d-nb.info/gnd/116713704, married, http://d-nb.info/gnd/52754181>
      ...
      [Figure: screenshot with callouts numbered 1 to 5]
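Relations of this form can be turned into a graph in which paths are candidate "stories". The sketch below parses three of the triples listed above, links subject and object in both directions so that a reader can browse either way, and enumerates paths up to a given number of hops. Treating the graph as undirected and stopping at plain enumeration are assumptions made here; how DKT ranks and recommends paths is not described in the slides.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
Step = Tuple[str, str]  # (predicate, node reached)

TRIPLE = re.compile(r"<\s*(\S+?),\s*(\w+),\s*(\S+?)\s*>")


def parse_triples(text: str) -> List[Triple]:
    return TRIPLE.findall(text)


def build_graph(triples: List[Triple]) -> Dict[str, List[Step]]:
    graph: Dict[str, List[Step]] = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
        graph[obj].append((predicate, subject))
    return graph


def paths_from(graph: Dict[str, List[Step]], start: str, max_hops: int) -> List[List[Step]]:
    results: List[List[Step]] = []

    def walk(node: str, path: List[Step], seen: frozenset) -> None:
        if path:
            results.append(path)
        if len(path) == max_hops:
            return
        for predicate, nxt in graph.get(node, []):
            if nxt not in seen:  # never revisit a node within one path
                walk(nxt, path + [(predicate, nxt)], seen | {nxt})

    walk(start, [], frozenset({start}))
    return results


relations = """
<http://d-nb.info/gnd/119173425, was, http://d-nb.info/gnd/118629867>
<http://d-nb.info/gnd/119178893, designed, http://d-nb.info/gnd/118629867>
<http://d-nb.info/gnd/118876759, love, http://d-nb.info/gnd/118629867>
"""
graph = build_graph(parse_triples(relations))
for path in paths_from(graph, "http://d-nb.info/gnd/119173425", max_hops=2):
    print(path)
```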
  • 19. Curation Dashboard
  • 20. Conclusions
      • Curation technologies are smart technologies to support knowledge workers handling content and knowledge.
      • The multilingual Digital Single Market will create a massive need for multilingual Curation Technologies due to an ever-increasing need for multilingual content.
      • DKT is mostly centred around German and English.
      • We cater for a small set of curation processes.
      • To be extended in a larger follow-up project.
      • Extended set of curation processes, more complex approaches, many more languages.