The State of Computational Morphology for Europe’s
Languages and the META-NET Strategic Research Agenda
Georg Rehm
Network Manager META-NET
DFKI, Berlin, Germany
georg.rehm@dfki.de

3rd Int. Workshop on Systems and Frameworks for Computational Morphology (SFCM 2013)
Berlin, Germany – September 06, 2013

Co-funded by the 7th Framework Programme and the ICT Policy Support Programme of the European Commission through
the contracts T4ME, CESAR, METANET4U, META-NORD (grant agreements no. 249119, 271022, 270893, 270899).
Outline
- Introduction
- Language White Paper Series: Europe’s Languages in the Digital Age
- The State of Computational Morphology for Europe’s Languages
- The META-NET Strategic Research Agenda for Multilingual Europe
- Conclusions

http://www.meta-net.eu

Multilingual Europe
- Where were we back in 2010?
- Challenge: Providing each language community with the most advanced technologies for communication and information so that maintaining their mother tongue does not turn into a disadvantage.
- While research has made considerable progress in recent years, the pace of progress is not fast enough to meet the challenge within the next 10-20 years.
- All stakeholders (researchers, LT user and provider industries, language communities, funding programmes, policy makers) should team up in a strategic alliance for a major dedicated push.

Objectives
META-NET is a network of excellence dedicated to fostering the technological foundations of the European multilingual information society.

Four EU-Funded Projects
- Initial project: T4ME (FP7; 13 partners, 10 countries)
- Three ICT-PSP consortia since Feb. 2011: CESAR, METANET4U, META-NORD
- All four projects ended on January 31, 2013.
- All EU member states and several non-member states covered.
- META-NET in Sept. 2013: 60 members in 34 countries.

http://www.meta-net.eu/members

Europe’s Languages in the Digital Age

Language White Paper Series

Language White Paper Series
- “Europe’s Languages in the Digital Age”.
- Reports on the state of our languages in the digital age and the level of support through language technology.
- Series covers 30 languages.
- Key communication instruments to address decision makers and journalists.
- Inform about societal and technological problems and challenges as well as economic opportunities.
- >2 years in the making.
- >200 national experts as contributors.
- >8,000 copies printed and distributed to politicians and journalists.

Language White Paper Series
- Structure:
  - Part 1: Executive Summary
  - Part 2: Languages at Risk: A Challenge for Language Technology
  - Part 3: The [X] Language in the European Information Society
  - Part 4: LT support for [X]
  - Part 5: About META-NET; References, etc.
- Language White Paper Series (published by Springer):
  - Ca. 8,000 printed copies distributed by META-NET.
  - Printed copies can be purchased through the usual channels.
  - Ebooks available via SpringerLink (fee) and META-NET website (free).
  - http://www.meta-net.eu/whitepapers

30 Languages Covered
- Basque
- Bulgarian*
- Catalan
- Czech*
- Danish*
- Dutch*
- English*
- Estonian*
- Finnish*
- French*
- Galician
- German*
- Greek*
- Hungarian*
- Icelandic
- Irish*
- Italian*
- Latvian*
- Lithuanian*
- Maltese*
- Norwegian
- Polish*
- Portuguese*
- Romanian*
- Serbian
- Slovak*
- Slovene*
- Spanish*
- Swedish*
- Croatian

Next up: Welsh

* = Official EU language

Cross-Lingual Comparison
- In four application areas, each language is assigned to one of five clusters, ranging from excellent LT support to weak/no support:
  1. Machine Translation
  2. Speech Processing
  3. Text Analytics
  4. Language Resources
- Results finalised at a meeting in Berlin with representatives of all 30 languages (October 21/22, 2011).

MT
  excellent: (none)
  good: English
  moderate: French, Spanish
  fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
  weak or no support: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish

Speech
  excellent: (none)
  good: English
  moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
  fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
  weak or no support: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian

Text Analysis
  excellent: (none)
  good: English
  moderate: Dutch, French, German, Italian, Spanish
  fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
  weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian

Resources
  excellent: (none)
  good: English
  moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
  fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
  weak or no support: Icelandic, Irish, Latvian, Lithuanian, Maltese

Europe’s Languages and LT

[Figure: languages grouped along a scale from “good support through Language Technology” to “weak or no support”]
1. English
2. Dutch, French, German, Italian, Spanish
3. Catalan, Czech, Finnish, Hungarian, Polish, Portuguese, Swedish
4. Basque, Bulgarian, Danish, Galician, Greek, Norwegian, Romanian, Slovak, Slovene
5. Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian

Not enough R&I on European languages
[Bar chart: languages treated in the 2010 editions of Journal of Computational Linguistics and Conferences of ACL, EMNLP and COLING; vertical axis from 0 to 450.]
Languages on the horizontal axis, in chart order: English, Chinese, German (Standard), French, Spanish, Japanese, Arabic, Dutch, Portuguese, Czech, Danish, Swedish, Hindi, Korean, Turkish, Italian, Russian, Finnish, Hebrew, Hungarian, Slovene, Urdu, Romanian, Zulu, Bulgarian, Catalan-Valencian-Balear, Greek, Thai, Welsh, Estonian, Basque, German (Swiss), Inuktitut, Indonesian, Ineseño, Latin, Marathi, Malay, Pushto, Serbian, Syriac, Tamil, Ugaritic, Ukrainian, Uspanteko, Vietnamese.
- Many European languages with no reference at all: Slovak, Maltese, Lithuanian, Irish, Albanian, Croatian, Galician etc.
- LT research on European languages, except for English, is too weak and too slow.
- Many languages are badly covered.

Key Observations
- When it comes to Language Technology support, there are massive differences between Europe’s languages and technology areas.
- LT support for English is ahead of that for any other language.
- Even support for English is far from being perfect.
- The gap between English and the other languages keeps widening!
- Several languages (Icelandic, Latvian, Lithuanian, Maltese) receive the weakest score in all four areas!
- At least 21 European languages in danger of digital extinction! (Languages put into the “weak or no support” category at least once.)
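
The two counts in these observations can be re-derived from the cluster assignments of the cross-lingual comparison. The following is a minimal Python sketch, assuming the “weak or no support” lists as transcribed from the comparison slide; the dictionary and function names are illustrative and not part of the META-NET methodology.

```python
# "Weak or no support" clusters per application area, transcribed from the
# cross-lingual comparison slide. All names below are illustrative assumptions.
WEAK_OR_NO_SUPPORT = {
    "Machine Translation": {
        "Basque", "Bulgarian", "Croatian", "Czech", "Danish", "Estonian",
        "Finnish", "Galician", "Greek", "Icelandic", "Irish", "Latvian",
        "Lithuanian", "Maltese", "Norwegian", "Portuguese", "Serbian",
        "Slovak", "Slovene", "Swedish",
    },
    "Speech Processing": {
        "Croatian", "Icelandic", "Latvian", "Lithuanian", "Maltese", "Romanian",
    },
    "Text Analytics": {
        "Croatian", "Estonian", "Icelandic", "Irish", "Latvian", "Lithuanian",
        "Maltese", "Serbian",
    },
    "Language Resources": {
        "Icelandic", "Irish", "Latvian", "Lithuanian", "Maltese",
    },
}


def weak_at_least_once(clusters):
    """Languages that fall into the weakest cluster in at least one area."""
    return set().union(*clusters.values())


def weak_in_all_areas(clusters):
    """Languages that fall into the weakest cluster in every area."""
    return set.intersection(*clusters.values())


at_risk = weak_at_least_once(WEAK_OR_NO_SUPPORT)
weakest = weak_in_all_areas(WEAK_OR_NO_SUPPORT)

print(len(at_risk))     # 21
print(sorted(weakest))  # ['Icelandic', 'Latvian', 'Lithuanian', 'Maltese']
```

The union gives the 21 languages named in the headline claim (the Machine Translation cluster contributes 20 of them, Speech Processing adds Romanian), and the intersection gives the four languages that receive the weakest score in all four areas.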

White Paper Box Sets (100 copies)

White Paper Website

Press Campaign
- Headline of press release: At Least 21 European Languages in Danger of Digital Extinction.
- Sent out to journalists, politicians and other stakeholder groups on the European Day of Languages (Sept. 26, 2012).
- Overwhelmed by the huge interest in the topic and our key findings!
- 600+ mentions in the press.
- 50+ broadcast interviews with META-NET representatives (ca. 30 radio interviews, ca. 25 television reports).
- News came in from 40+ countries in 35+ different languages.
- Whole of Europe covered.
- Two Parliamentary Questions in the European Parliament on the “digital extinction of languages” topic.

Coverage by Country
[Pie chart: share of press coverage by country]
- Spain: 15.90%
- Bulgaria: 10.80%
- International: 7.90%
- Latvia: 5.30%
- Netherlands: 4.80%
- Greece: 4.60%
- Romania: 4.40%
- Serbia: 4.40%
- Italy: 4.20%
- Germany: 3.50%
- Russia: 3.50%
- Estonia: 2.90%
- France: 2.60%
- Slovenia: 2.40%
- Iceland: 2.20%
- Malta: 2%
- USA: 1.50%
- Denmark: 1.30%
- Latin America: 1.30%
- Lithuania: 1.30%
- Ireland: 1.30%
- UK: 1.10%
- Belgium: 0.90%
- Finland: 0.70%
- Sweden: 0.70%
- Poland: 0.70%
- Norway: 0.40%
- Mexico: 0.40%
- Brazil: 0.40%
- Slovakia: 0.40%
- Basque Country: 0.40%
- Portugal: 0.40%
- Austria: 0.20%
- New Zealand: 0.20%
- Hungary: 0.20%
- Bosnia and Herzegovina: 0.20%
- Costa Rica: 0.20%
- Cyprus: 0.20%
- Canada: 0.20%
- Australia: 0.20%

Response: Examples
- Austria: Der Standard.
- Denmark: Politiken, Berlingske Tidende.
- Finland: Tiede.
- Germany: Heise Newsticker, Süddeutsche Zeitung.
- Greece: in.gr, Πρώτο Θέμα, Prosilipsis.
- Hungary: Origo.
- Iceland: Fréttablaðið, Morgunblaðið.
- Italy: Wired.
- Norway: Computerworld.
- Slovenia: Delo, Dnevnik, Demokracija.
- Serbia: Politika.
- Spain: El Mundo.
- UK: Huffington Post.
- USA: Mashable, NBC News, Reddit.


Press Campaign: Highlights

[Newspaper clipping, Berlingske; translated from Danish]

Caption: Words. Researchers are working to improve Danish translations on the internet.

Poor language technology threatens Danish online
By Jens Ejsing // ejs@berlingske.dk

The Danish language is having a hard time in the digital world.
That is the conclusion of Danish language researchers and experts in connection with the new international study META-NET, which takes a closer look at how a large number of smaller European languages such as Danish are faring in the digital world.
The researchers, from the University of Copenhagen and the Danish Language Council (Dansk Sprognævn) among others, conclude that Danish may have an even harder time in the digital world in future, because Google Translate, GPS devices, smartphone apps and other language technology programs are not sufficiently able to handle the many nuances of the Danish language.
Bolette Sandford Pedersen, professor of language technology at the University of Copenhagen, believes there is a need for a kind of digital Danish language bank filled with data, so that translations, among other things, become as precise and good as possible. With help from the language bank, researchers can, according to the professor, help companies improve programs that have to handle linguistic knowledge for, among other things, machine translation, speech recognition and information retrieval.
This would make erroneous translations rarer, such as when “hæld olie på panden” (“pour oil into the pan”) becomes “pour oil on the forehead” in English with Google Translate. Translations that are, in the worst case, so imprecise that Danes end up opting out of their own language in the digital world.

Language help for companies
She acknowledges, however, that “the technology for automatic translation is fantastic in many ways”.
“It is just not good enough when it comes to Danish,” she says:
“It is as if we are, to some extent, leaving it in the hands of Google or other companies to decide whether Danish is handled well enough or not. But the Danish market is not large for them. The question is therefore whether we should not do more ourselves to ensure that the necessary data is available, so that we get good translations and other good language technology. That could be done, for example, by making an effort to set up a language bank with a lot of enriched material about Danish.”
“If we constantly find that translations are riddled with errors, we dare not trust them,” she says, stressing that “erroneous translations can lead to major misunderstandings”.
According to the director of the Danish Language Council, Sabine Kirchmeier-Andersen, poor language technology can have consequences for many Danes who are not so good at English.
“If we have ambitions to use the Danish language in the technological universe of the future, an effort must be made now to retain expertise and build on the knowledge we have,” she says:
“Otherwise we risk that only people who speak fluent English will benefit from the new generations of web, telecom and robot technology that are on the way.”

Facts: Languages in Europe
- There are around 80 languages in the EU. For 21 of them, Danish included, there are major language technology gaps in areas such as machine translation, speech recognition and information retrieval.
- According to an EU survey, a growing number of European internet users buy goods or services online where the language used is not their own. This applies to more than half of users.
- More than one in three use a foreign language to write emails or posts online.

Press Campaign: Highlights
[Newspaper clipping, translated from Greek]
ELEFTHEROS TYPOS, Thursday 27 September 2012, page 38

Life

Date 30 September 2012
Page 16

MANY EUROPEAN LANGUAGES ARE CONSIDERED TECHNOLOGICALLY... OUTDATED

Greek at risk of digital extinction
By Eleni Vergou, evergou@e-typos.com

In the digital age, people do not... speak Greek, nor several other European languages, according to a pan-European report signed by more than 200 experts. The study was published by the scientific network META-NET on the occasion of yesterday’s European Day of Languages.
For the purposes of their research, linguists from 34 countries of the Old Continent rated the available language services and created a “White Paper” for each European language. In their study, the experts looked for, among other things, four basic electronic tools, namely the existence of machine translation, the capability for voice interaction and for digital text analysis, while the availability of language resources was also examined.
In a first phase they examined the websites that allow users to translate online, such as Google Translate, the service of the IT giant. At the same time, the “communication” of Greek-speaking users with their... devices was examined, for example whether someone can “speak” to a GPS in their mother tongue. The researchers concluded that such devices exist, but are not as widespread as the English-language ones.
The “gold” medal goes, as is only logical, to English. English-speaking users have the best possible technological support, which favours the further spread of the language. Icelandic, Latvian, Lithuanian and Maltese are most at risk of “technological exclusion”, while Greek, Bulgarian, Hungarian and Polish are slightly better off; according to the study they have “fragmentary” technological support.
Support for users of Dutch, French, German, Italian and Spanish is rated “moderate”. The heads of the scientific team, Hans Uszkoreit and Georg Rehm, state: “There are dramatic differences in language technology support between the various European languages. The gap between ‘small’ and ‘big’ languages keeps widening. We must ensure that the smaller and less digitally resourced languages are equipped with the necessary basic technologies. Otherwise, these languages are doomed to digital extinction.”
Indeed, the experts stress that without decisive action these languages will struggle to... survive in the digital world of the 21st century. Ms Maria Gavrilidou, a member of the scientific team from the Institute for Language and Speech Processing of the Athena Research Centre, tells “E.T.”: “This study does not say that the Greek language will not live on or that it is in danger of extinction.” The expert explains that as long as there are people who speak, write and communicate in a language, it will continue to exist. It is important, however, that all users are able to “speak” to machines, such as their GPS devices, in Greek and have computer language tools at their disposal.
Among these “tools” are spelling and grammar checkers, which are used daily by hundreds of Greek users and rely on language technology.
Nevertheless, she stresses that the digital spread of a language is important: “It is not in the hands of the average user. Governments, the European Union and the private sector must fund the development of this technology for all languages,” she says, and continues: “Users, however, must demand that these tools also exist in their own language and not settle for English.”

Pull quote: English-speaking users have the best possible technological support, a fact that favours the further spread of the language.

GREEKLISH: The language of alienation...
[Photo: Georgios Babiniotis.]
Most young people in our country now communicate in GREEKLISH via text messages or email. Although language tools that allow the use of Greek characters have existed in recent years, teenagers and young adults do not seem to have “embraced” these technologies. The professor of linguistics, Mr Georgios Babiniotis, tells “E.T.”: “Greeklish is a problem for the Greek language, especially for young people, for a purely linguistic reason. By using Greeklish they become alienated from the form of the word or, as we say, the etymological image that is expressed by the word’s spelling and is linked both to the word’s meaning and to its origin.” The danger facing young people is alienation from the written form of the language. This “familiarity”, however, also helps in understanding a word’s meaning and origin. “This alienation is not insignificant,” says the expert, who explains that the process of writing helps to imprint the word and connect it with other words of the same root. “When this form of communication is used, they are destroyed, they weaken. It is not fatal, but it will do damage,” says Mr Babiniotis, who advises users to choose Greek characters.

Copyright material. This may only be copied under the terms of a Newspaper Licensing Agency agreement (www.nla.co.uk) or with written publisher permission.
For external republishing rights see www.nla-republishing.com

Press Campaign: Highlights
[Newspaper clipping, translated from Greek]
Sunday 30 September 2012, ΚΟΣΜΟΣ, page 49

Most European languages are at risk of digital extinction

They... lost my language
It must be ensured that the smaller and less digitally resourced languages are equipped with the necessary basic technologies

26 September has been established by the Council of Europe as the European Day of Languages, but according to a new European scientific report, 21 of Europe’s 30 languages, Greek among them, face the risk of digital extinction.
The study sounds the alarm, having found that digital support for most European languages is deficient or entirely non-existent for users.

Devoured by the common ones
The report, in the form of a series of White Papers (entitled “Languages in the European Information Society”) from the scientific network META-NET, which brings together 60 research centres in 34 countries, points out that languages spoken by relatively few people are at risk because they lack the technological support that widely used languages have. White Papers have been prepared for the following European languages: English, Basque, Bulgarian, Galician, French, German, Danish, Greek, Estonian, Irish, Icelandic, Spanish, Italian, Catalan, Croatian, Latvian, Lithuanian, Maltese, Norwegian (Bokmål and Nynorsk), Dutch, Hungarian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Swedish, Czech and Finnish. Each White Paper is written in the language it covers and translated into English.

Four great dangers
According to the new study, Icelandic, Latvian, Lithuanian and Maltese face the greatest risk of extinction in a European technological society that increasingly promotes the use of particular languages, above all English. But other languages, such as Greek, Bulgarian, Hungarian and Polish, are also at risk in the modern digital world.
The META-NET study, to which more than 200 experts contributed, assesses the risk for each language on the basis of four main criteria at the technological/digital level: the existence of machine translation for the language, the capability for voice interaction, the capability for digital text analysis and the availability of the relevant digital language resources.

The strong ones
The language with the best score on these criteria is of course English, which enjoys comparatively the best technological support (though not the best possible), a fact that facilitates its further spread.
Next, with satisfactory or moderate technological/digital support, come Dutch, French, German, Italian and Spanish. Greek, like Basque, Catalan, Polish, Hungarian and others, is ranked among the languages with only “fragmentary” support, which is exactly why they are considered languages at high risk of extinction.

Dramatic differences
According to the study’s editors, Hans Uszkoreit and Georg Rehm, “there are dramatic differences in language technology support between the various European languages and technology areas. The gap between ‘small’ and ‘big’ languages keeps widening. We must ensure that the smaller and less digitally resourced languages are equipped with the necessary basic technologies, otherwise these languages are doomed to digital extinction.”
The hope for these languages is considered to lie in improving and making wider use of language technology software, which enables the spoken and written processing of the various languages.
Examples of these capabilities are electronic spelling and grammar checkers, the interactive personal “assistants” of smartphones (e.g. Siri on the iPhone), machine translation systems, the automated dialogue systems of call centres, search engines, the synthetic voice in car navigation systems, and others.

The basic problem
What matters, according to the report, is that all these capabilities are also offered to users in their mother tongue that is at risk of extinction. Without decisive action, the report makes the ominous prediction that these languages will struggle to survive in the digital world of the 21st century.
One problem is that the software of these language technology systems relies on statistical methods that require huge quantities of written or spoken data, yet that much data is hard to obtain for languages spoken by relatively few people.
Moreover, even for widely used languages such as English, the relevant language technology still has weaknesses, evident for example in highly inadequate, error-ridden machine translations. The report proposes that a coordinated, large-scale effort be undertaken in Europe so that the necessary technologies are gradually created or improved and the digitally marginalised languages are helped.

Press Campaign: Highlights
[Five further slides of press clippings, shown as images only.]

Website: Visitors Overview

[Chart of website visits over time, annotated as follows]
- began sending out press release
- European Day of Languages
- unusually high traffic

Website: Visitors’ Cities
City with the most
visits: Brussels!

The State of Computational Morphology for Europe’s Languages

Computational Morphology for Europe’s Languages

Computational Morphology?
- So, what is the state of Computational Morphology support? Do we have precise, good, reliable tools for all European languages?
- Answering this question is a non-trivial, difficult and complex task.
- However, we can provide a rough approximation.
- In META-NET we had a look at 30 languages (Basque, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hungarian, Icelandic, Irish, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Spanish, Swedish).
- We gathered data on several aspects that were used to prepare a cross-language comparison, along with statistics, discussions, comparisons, experts’ opinions, etc.

Coarse-Grained View
- We investigated four main areas: Machine Translation; Speech; Text Analytics; Language Resources.
- Computational Morphology is covered by Text Analytics.
- Text Analytics comprises, among others,
  - the quality and coverage of existing text analytics technologies (morphology, syntax, semantics),
  - coverage of linguistic phenomena and domains,
  - amount and variety of available applications,
  - quality and coverage of existing lexical resources and grammars.

Text Analytics

Coarse-Grained View
excellent: (none)
good: English
moderate: Dutch, French, German, Italian, Spanish
fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian

Key Observations
- When it comes to Language Technology support, there are massive differences between Europe’s languages and technology areas.
- LT support for English is ahead of that for any other language.
- Even support for English is far from being perfect.
- The gap between English and the other languages keeps widening!
- Several languages (Icelandic, Latvian, Lithuanian, Maltese) receive the weakest score in all four areas!

Simplified Methodology
- Distributed data collection process in the respective countries.
- 30 tables provide data for all languages (tools, resources, gaps etc.).
- Reduce numbers to one final score per language and area.
- Calibration of tables across languages in smaller groups.
- Final scores for each area and language were derived from two central features (quality, coverage), resulting in one big table:
Basque

Bulgarian

Catalan

Croatian

Czech

Danish

Dutch

English

Estonian

Finnish

French

Galician

German

Greek

Hungarian

Icelandic

Irish

Italian

Latvian

Lithuanian

Maltese

Norwegian

Polish

Portuguese

Romanian

Serbian

Slovak

Slovene

Spanish

Swedish

Language Technology (Tools, Technologies, Applications)
Tokenization, Morphology (tokenization, POS tagging,
morphological analysis/generation)analysis)
Parsing (shallow or deep syntactic
Sentence Semantics (WSD, argument structure, semantic roles)
Text Semantics(coreferenceresolution, context, pragmatics,
inference) Discourse Processing (text structure, coherence,
Advanced
rhetorical structure/RST, argumentative zoning, argumentation,
Information Retrieval(text indexing, multimedia IR, crosslingual
IR)
Information Extraction (named entity recognition,
event/relation extraction, opinion/sentiment report generation,
Language Generation (sentence generation,recognition, text
text generation)
Summarization, Question Answering,advanced Information
Access Technologies
Machine Translation
Speech Recognition
Speech Synthesis
Dialogue Management (dialogue capabilities and user
modelling)

[Table cells: per-language scores for the Language Technology rows above and the Language Resources rows below. Row and column alignment was lost in extraction, so the individual values are not reproduced here.]

Language Resources (Resources, Data, Knowledge Bases)

Reference Corpora
Syntax-Corpora(treebanks, dependency banks)
Semantics-Corpora
Discourse-Corpora
Parallel Corpora, Translation Memories
Speech-Corpora (raw speech data, labelled/annotated speech
data, speech dialogue data)
Multimedia and multimodal data
Language Models
Lexicons, Terminologies
Grammars
Thesauri, WordNets
Ontological Resources for World Knowledge (e.g. upper
models, Linked Data)

Simplified LR/LT Table (German)

0: very low
6: very high

Text Analytics

Coarse-Grained View
excellent: (no language)
good: English (4.50)
moderate: Dutch (3.94), French (3.71), German (3.36), Italian (3.50), Spanish (3.77)
fragmentary: Basque (3.36), Bulgarian (2.80), Catalan (3.21), Czech (3.29), Danish (3.00), Finnish (3.64), Galician (3.43), Greek (2.71), Hungarian (3.79), Norwegian (4.36), Polish (4.07), Portuguese (3.64), Romanian (3.87), Slovak (2.43), Slovene (3.57), Swedish (4.57)
weak or no support: Croatian (2.43), Estonian (3.14), Icelandic (3.50), Irish (3.71), Latvian (3.14), Lithuanian (1.79), Maltese (0.80), Serbian (1.64)

In parentheses: average scores of the grammatical analysis feature.
Several additional categories and features informed and influenced the overall ranking of a language in
one of the five categories. Neither the individual scores nor the average scores have been calibrated
against the scores assigned to the LT support of other languages. On their own, these scores cannot be
used for a cross-language comparison; nevertheless, the average scores show how the authoring teams
themselves perceive the state of the grammatical analysis category for their respective language.
“Grammatical Analysis” Feature
Language    Quantity  Availability  Quality  Coverage  Maturity  Sustainability  Adaptability  Average  Level of support (Text Analytics)
Basque      4         2.5           4        4         4         2.5             2.5           3.36     fragmentary
Bulgarian   2.4       2             3.6      3.6       2.8       2.4             2.8           2.80     fragmentary
Catalan     3         2.5           4        4         4         2.5             2.5           3.21     fragmentary
Croatian    2         1.5           3.5      3         2         1               4             2.43     weak/no
Czech       4         2             4        4         3         2               4             3.29     fragmentary
Danish      3         2             4        4         3         2               3             3.00     fragmentary
Dutch       3.6       5.4           4.8      3.6       4.8       3.6             1.8           3.94     moderate
English     5         5             5.5      4.5       4.5       3               4             4.50     good
Estonian    2.5       3.5           3.2      2.8       4         2.5             3.5           3.14     weak/no
Finnish     3.5       3.5           3.5      4         4         3.5             3.5           3.64     fragmentary
French      4         4             4        4         4         3               3             3.71     moderate
Galician    3         5             4        4         3         2               3             3.43     fragmentary
German      4         2.5           4        4         4         2.5             2.5           3.36     moderate
Greek       2         1.5           3.5      3         3         3               3             2.71     fragmentary
Hungarian   4.5       2             4        4.5       4         3               4.5           3.79     fragmentary
Icelandic   2         5.5           4        3         3.5       3.5             3             3.50     weak/no
Irish       4         4             3        3         4         4               4             3.71     weak/no
Italian     3.5       3             4        5         4         3               2             3.50     moderate
Latvian     2.5       2             3        3.5       4         3               4             3.14     weak/no
Lithuanian  2         1.5           2.5      2         1.5       1               2             1.79     weak/no
Maltese     0.8       0.8           0.8      0.8       0.8       0.8             0.8           0.80     weak/no
Norwegian   4         4.5           4        4         4.5       4.5             5             4.36     fragmentary
Polish      4         4.5           4.5      4.5       4         4               3             4.07     fragmentary
Portuguese  3         3             4        4         4.5       2.5             4.5           3.64     fragmentary
Romanian    4         3.5           4        3.6       4.5       3.5             4             3.87     fragmentary
Serbian     1         1             2.5      2         2         1.5             1.5           1.64     weak/no
Slovak      2         2             3        2         2         3               3             2.43     fragmentary
Slovene     2.5       4             4.5      3.5       3         3               4.5           3.57     fragmentary
Spanish     3.5       3             5.4      4.5       3.5       3               3.5           3.77     moderate
Swedish     4.5       3.5           5        4         5         5               5             4.57     fragmentary
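The Average column of the “Grammatical Analysis” table appears to be the unweighted mean of the seven criterion scores. The slides do not state the formula, so this is an assumption, but the published figures are consistent with it. A minimal Python sketch that recomputes the averages for a few rows copied from the table (the names `CRITERIA`, `SCORES` and `average_score` are illustrative, not part of any META-NET tooling):

```python
from statistics import mean

# Criterion order as in the table columns.
CRITERIA = (
    "Quantity", "Availability", "Quality", "Coverage",
    "Maturity", "Sustainability", "Adaptability",
)

# A few rows copied from the "Grammatical Analysis" table.
SCORES = {
    "Basque":   (4, 2.5, 4, 4, 4, 2.5, 2.5),
    "Dutch":    (3.6, 5.4, 4.8, 3.6, 4.8, 3.6, 1.8),
    "English":  (5, 5, 5.5, 4.5, 4.5, 3, 4),
    "Romanian": (4, 3.5, 4, 3.6, 4.5, 3.5, 4),
}


def average_score(values):
    """Unweighted mean of the seven criterion scores, rounded to two decimals.

    Assumption: the slides do not spell out how the average is computed.
    """
    if len(values) != len(CRITERIA):
        raise ValueError("expected one score per criterion")
    return round(mean(values), 2)


for language, values in SCORES.items():
    print(f"{language}: {average_score(values):.2f}")
```

As the slide itself notes, these averages reflect each authoring team's own perception and are not calibrated across languages, so the recomputed values should not be read as a cross-language ranking.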
Across Categories
- The four area rankings of the 30 languages on the five point scale (from “excellent support” to “weak/no support”) take many different features and factors into account.
- The “grammatical analysis” data are only one single piece of the puzzle – the piece that is closest to Computational Morphology.
- Let’s have a look at the individual White Papers and the languages as they are ranked – from “good support” to “weak/no” support.
- The following ranking is in terms of Text Analytics; the excerpts taken from the White Papers refer to morphological tools.
Good Support
- English is the only language considered to have “good support” in terms of Text Analytics.
- In comparison to certain other languages and language families, the morphology of English is usually considered rather simple and straightforward.
- Many robust and precise off-the-shelf technologies exist.
- This is most probably the main reason why the authors of the white paper on English do not discuss morphology components at all, nor any related issues or challenges.
Moderate Support
- Same trend in this category concerning morphological tools.
- Authors mainly discuss other research and technology gaps, mentioning the existence of, for example, “medium- to high-quality software for basic text analysis, such as tools for morphological analysis and syntactic parsing” (German).
- Some authors mention morphology on a more superficial level (Italian, Spanish) or not at all (Dutch).
- The authors of the white paper on French emphasise that large programmes were set up (1994–2000; 2003–2005) to build a set of basic technologies for French, from spoken and written language resources to spoken and written language processing systems.
Fragmentary Support 1/4
- 16 languages only have fragmentary support in Text Analytics.
- The respective authoring teams report the existence of one or two morphological tools per language.
- Clear tendency: these tools have limited functionality and a long history, including an unclear copyright situation (Hungarian).
- They are neither freely nor immediately available (Danish, Romanian).
- However, these tools are usually employed in the large office suites (MS Office, Open Office), localisation frameworks or national search engines (Norwegian, Czech, Slovak).
Fragmentary Support 2/4
- Key factors behind the small number of morphological components: rich morphological systems; high degree of inflection; lack of morphological distinction for certain nominal cases.
- These linguistic properties make morphological processing, as well as all approaches based primarily on statistics, a challenge (Basque, Polish, Slovene and other languages).
- Special characters and encoding systems are mentioned for languages with alphabets that go beyond plain ASCII: processing words when diacritics are missing (web, email) is a challenge. Experts demand more robust error detection algorithms (Czech).
- Important observation (Basque, Greek): algorithms and approaches developed for English cannot be directly transferred to other languages.
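To make the diacritics problem concrete, here is a minimal Python sketch of one simple way to recover candidate spellings for words typed without diacritics: index a lexicon by its diacritic-free form and look up the stripped input. This is an illustration only, not a method described in the White Papers; the tiny Czech lexicon and all function names are assumptions made for the example.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(word: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(lexicon):
    """Map each diacritic-free, lower-cased form to the lexicon entries behind it."""
    index = defaultdict(list)
    for form in lexicon:
        index[strip_diacritics(form).lower()].append(form)
    return index


def restore_candidates(word, index):
    """Return all lexicon forms that match the input once diacritics are ignored."""
    return index.get(strip_diacritics(word).lower(), [])


# Hypothetical toy lexicon; a real system would use a full morphological lexicon.
LEXICON = ["být", "byt", "kůň", "příliš"]
INDEX = build_index(LEXICON)

print(restore_candidates("prilis", INDEX))  # one candidate
print(restore_candidates("byt", INDEX))     # ambiguous: two candidates
```

The second lookup shows why this is hard: "byt" may be the correctly spelled noun or the verb "být" with its diacritic dropped, so a word-level lookup alone cannot decide and some form of context-sensitive disambiguation is needed.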
Fragmentary Support 3/4
- Languages spoken in smaller countries usually do not receive as much attention and research funding as larger languages, for which a larger base of researchers typically works on building actual technologies, maybe even breaking new ground (Greek).
- Hungarian: a lack of synchronisation between parallel efforts to build morphological processors led to substantial friction loss. This is why several morphological parsers for Hungarian exist but use conflicting and incompatible formalisms.
- Some authors discuss related technologies such as, for example, e-learning tools and systems for second language learners that employ complex morphological components (Czech).
Fragmentary Support 4/4
- Portugal set up a project in 2005 to enable the development of a set of linguistic resources and components to support the processing of Portuguese. Outcome: large corpus and tools for tokenisation, morphosyntactic tagging, inflection analysis, and lemmatisation.
- Slovakia set up a project to provide processing of Slovak for linguistic research purposes within the National Research and Development Programme. Outcome: tools and data sets that include processors and morphologically annotated corpora.
- French (1994-2000) had a clear head-start over Portuguese and Slovak in addition to a longer, more established research tradition in this area, which is why it was ranked higher.
- The Slovak experts conclude that, while certain morphological tools do exist, “those must be further developed and supported.”
Weak or No Support 1/2
- This category concerns eight languages.
- A small or very small number of morphological tools or components exist (Irish) and are used, even in well-known applications, but they are neither freely available nor accessible for research purposes.
- Tools are based on very simple approaches that rely on word lists (Lithuanian, Estonian, Croatian).
- Several of these tools have been in development since the 1980s and are under the control of companies. Researchers often use ispell or aspell (open source) as a technological fallback solution.
- The complex morphology of languages is mentioned in almost all cases along with the statement that morphology processing must be further developed (Icelandic, Estonian, Croatian, Maltese, Serbian).
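Why word-list approaches fall short for inflecting languages can be sketched in a few lines of Python. The example contrasts a plain word list that holds only citation forms with a full-form lexicon that maps each inflected form to its lemma. The Croatian entries and all names are hypothetical illustrations chosen for this sketch; they are not taken from any of the tools discussed in the White Papers.

```python
# Plain word list: citation forms only (hypothetical toy data).
WORD_LIST = {"kuća", "grad"}

# Full-form lexicon: every listed inflected form points to its lemma.
FULL_FORMS = {
    "kuća": "kuća",   # nominative singular
    "kuće": "kuća",   # genitive singular
    "kući": "kuća",   # dative singular
    "kuću": "kuća",   # accusative singular
    "grad": "grad",   # nominative singular
    "grada": "grad",  # genitive singular
}


def known_by_word_list(token: str) -> bool:
    """A word-list checker only accepts forms that are literally listed."""
    return token.lower() in WORD_LIST


def lemma_from_full_forms(token: str):
    """A full-form lexicon returns the lemma, or None for unknown forms."""
    return FULL_FORMS.get(token.lower())


for token in ["kuća", "kuću", "grada"]:
    print(token, known_by_word_list(token), lemma_from_full_forms(token))
```

With the word list, the correct inflected forms "kuću" and "grada" are rejected as unknown, while the full-form lexicon resolves them to their lemmas. Listing every form by hand does not scale for richly inflected languages, which is one reason dedicated morphological analysers are needed.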
Weak or No Support 2/2
- Authors demand more development for basic morphological tools.
- Perceived as very important: to design and model approaches to the specific linguistic properties of a language without trying to adapt an approach developed for English (Serbian, Estonian).
- One such step is to set up specific language technology programmes, as has been done, among others, in France, Slovakia and Portugal.
- In 2000, the Icelandic government set up a national programme with the aim of supporting institutions and companies in creating resources for Icelandic. Outcome: several projects, huge impact on the national field. Among its results are a full-form morphological database of Modern Icelandic inflections, a balanced morphosyntactically tagged corpus, a training model for data-driven POS taggers and an improved spell checker.
Summary
- Solid computational morphology tools only exist for a handful of European languages – i.e., those with many speakers (and funding).
- The smaller the language, the fewer tools exist.
- “Fragmentary” or “weak/no support”: 24 of the 30 languages – very few tools; very limited functionality; availability is a problem.
- In terms of the full NLP stack, computational morphology cannot be taken for granted; it is by no means a “solved problem”.
- More original research off the beaten track (i.e., English) needed.
- More coordination, synergies and research transfer between the languages needed.
- France, Iceland, Portugal, Slovakia show that large, dedicated funding programmes are needed to support the development of basic LRs/LTs.
The META-NET Strategic Research Agenda for Multilingual Europe

Strategic Research Agenda

Three Ingredients

Appropriate
Actors

Appropriate
Programme

Research &
Commercialisation

Vision & Agenda

Appropriate
Support
Funding

Three Vision Groups
- Translation and Localisation (technical documentation, official bulletins, GUI localisation, games, services etc.)
  - Target stakeholders: large users of translation services, (machine) translation, software companies, game companies, localisation industry
- Media and Information Services (audiovisual sector, news, digital libraries, portals, search engines etc.)
  - Target stakeholders: media industries, search engine providers, archives
- Interactive Systems (mobile assistance, dialogue translation, call centres, etc.)
  - Target stakeholders: mobile software and service providers, telecom industry, call centres
Vision Group Meetings
- Vision Group Translation and Localisation
  - July 23, 2010: Berlin, Germany
  - September 28, 2010: Brussels, Belgium
  - April 7/8, 2011: Prague, Czech Republic
- Vision Group Media and Information Services
  - September 10, 2010: Paris, France
  - October 15, 2010: Barcelona, Spain
  - April 1, 2011: Vienna, Austria
- Vision Group Interactive Systems
  - September 10, 2010: Paris, France
  - October 5, 2010: Prague, Czech Republic
  - March 28, 2011: Rotterdam, The Netherlands
Planning Process
[Diagram: timeline from 2010 to 2012. Expert meeting minutes feed the three Vision Group reports (Translation and Localisation; Media and Information Services; Interactive Systems), which lead to the Vision Paper, the Priority Themes Paper and, finally, the Strategic Research Agenda.]
Planning Process: Documents
[Figure: cover pages of the planning documents on the same 2010 to 2012 timeline.]

Vision Group reports (each a “Vision Document: Results of first two meetings”; dissemination level: public):
- Vision Group Media and Information Services. Editors: Maria Koutsombogera, Stelios Piperidis. Date: 10 November 2010.
- Vision Group Translation and Localisation. Editors: Aljoscha Burchardt, Georg Rehm. Date: 3 December 2010.
- Vision Group Interactive Systems. Editors: Joseph Mariani, Bernardo Magnini. Date: 28 December 2010.

Vision Paper: The Future European Multilingual Information Society. Vision Paper for a Strategic Research Agenda.
Priority Themes Paper: LT 2020. Vision and Priority Themes for Language Technology Research in Europe until the Year 2020. Towards the META-NET Strategic Research Agenda.
Strategic Research Agenda (2012).

“People can’t share knowledge if they don’t speak a common language.”
Davenport, Thomas H., and Laurence Prusak, Working Knowledge: How Organizations Manage What They Know, Harvard Business School, Boston, 1997, p. 98.
Preparation of the SRA
- Strategic Research Agendas of other initiatives were screened.
- Many suggestions as input from Vision Group members.
- We discussed procedures, input and structure of the SRA in four meetings of the META Technology Council.
  - Brussels, Belgium, November 16, 2010
  - Venice, Italy, May 25, 2011
  - Berlin, Germany, September 30, 2011
  - Brussels, Belgium, June 19, 2012
- Additional input in talks, meetings, workshops, discussions, etc.
  - Example: Three HLT Expert Meetings organised by the EC (end of 2011)
- Almost 200 experts contributed to the SRA (54% from industry; 46% from research; 4% from national/international institutions).
Strategic Research Agenda
- Addresses the problems we identified when preparing the white papers.
- Three priority research themes and application/innovation scenarios.
- Can put Europe ahead of its competitors in this technology area.
- >190 contributors; >2 years.
- Presented and discussed at 83 conferences and major workshops.
- Final version ready on Dec. 1, 2012.
- http://www.meta-net.eu/sra
SRA: Contents – Brief Glimpse
- Set the stage and describe the European situation, the needs and the LT research and industry.
- Discuss the state of IT, predictions and mega-trends.
- Our technology vision for 2020.
- Select and specify priority themes.
- Suggest a model for speeding up innovation.
- Outline proposals for the organisation of research and innovation.
Priority Themes: 3 + 2
- We decided on priority themes that (a) support technology progress, (b) lead to solutions that European society needs and (c) lead to solutions from which European industry will benefit as users or as providers.
  - Translingual Cloud
  - Social Intelligence and e-Participation
  - Socially-Aware Interactive Assistants
- Two additional themes:
  - European Service Platform for Language Technologies
  - Core Technologies for Language Analysis and Production
PT1: Translingual Cloud
- Europe has a big need for translations of publishable quality.
- Focus on high-quality translation.
- New research paradigms
  - Inclusion of professional translators into the research process
  - Inclusion of technologists into research on human translation processes
- Different technological approaches
  - Stronger emphasis on the properties of individual languages
  - A central role for semantics
- Methods for specific genres & domains
Priority Research Theme 1: Translingual Cloud
Written (twitter, blog, article, newspaper,
text with/without metadata etc.) or
spoken input (spontaneous spoken
language, video/audio, multiple speakers)

Extending
translation with
semantic data and
linked open data

Modular combination
of analysis, transfer
and generation
models

From very fast but lower
quality to slower but very
high quality (including
instant quality upgrades)

Services and Technologies:
Automatic translation and
interpretation
Language checking
Post-editing
Workbenches for creative
translations
Novel translation and authoring
workflows

Quality assurance
Computer-supported human
translation
Multilingual content production and
text authoring
Trusted service centre (privacy,
confidentiality, security of source
data)

Exploiting strong
monolingual analysis
and generation methods
and resources

Multiple target
formats
Domain, task and
genre specialisation
models

Applications:
Crosslingual communication,
translation and search
Real-time subtitling, voice-over
generation and translating speech
from live events
Mobile interactive interpretation

Any
device

Target groups: European citizen, language
professional, organisations, companies, European
institutions, software applications

Multilingual content production
(media, web, technical, legal
documents)
Showcases: translingual spaces for
ambient translation

Multiple target
formats

Single access
point
PT2: Social Intelligence
- Better decisions by monitoring social media
- Inclusion of citizens into collective decision processes
- Opinion formation, consensus building, decision making
- Evolution of new solutions
- New forms of democracy: e-democracy, massive participation, transparency
- Dialogues and debates across language boundaries and across parties, political alliances, social classes
- Better than binary voting
- Documented transparent decision processes
Priority Research Theme 2: Social Intelligence and e-Participation
Mapping large, heterogeneous,
unstructured volumes of online
content to structured, actionable
representations

From shallow to deep,
from coarse-grained to
detailed processing
techniques

Making language
technologies interoperable
with knowledge representation and the semantic web

“Semantification” of the
web: tight integration
with the Semantic Web
and Linked Open Data

Services and Technologies:
especially social media, comments,
blogs, forums
decision-relevant information

sentiment analysis and opinion mining
including the temporal dimension)

cues
from arbitrary online content
visualising discussions and opinion
statements

support

Applications:
Make use of the
wisdom of the
crowds

and processes; modeling evolution of
opinions

collective deliberation and
e-participation
wide deliberation on pressing issues

Unleashing social intelligence by
detecting and monitoring opinions,
demands, needs and problems

-

analysis technologies

Target groups: European citizen,
European institutions, discussion
participants, companies

Improved
efficiency and
quality of decision
processes

Understanding influence
diffusion across social media
Priority Research Theme 3: Socially-Aware Interactive Assistants
ments, any
vocabulary

recovery,
self-assessment

Multilingual
capabilities

Interacting
naturally
with and in
groups

Include human-computer,
human-artificial agent and
computer-mediated human-human communication

Learning
and
forgetting
information

Adaptable to the
user’s needs and
preferences and
the environment

Services and Technologies:
recognition

understanding

inter-dependencies

and synthesis, providing expressive
voices

incremental conversational speech

priority themes

models of human communication

Applications:
dialogue systems

modalities (visual, tactile, haptic)

environment

Proactive,
self-aware,
user-adaptable

Interacts naturally with
humans, in any
language and modality

Can be personalised to
individual communication
abilities including special needs

verbal/non-verbal behaviour, social
context

Can learn incrementally
from all interactions and
other sources of information
Providers of operational and research technologies and services
National
Language
Institutions

Language
Service
Providers

Priority Research Theme 1:
Translingual
Cloud

Language
Processing

Language
Technology
Providers

Universities

Priority Research Theme 2:
Social Intelligence
& e-Participation

European
Institutions

Priority Research Theme 3:
Socially Aware
Interactive Assistants

European Service Platform for Language Technologies
(Cloud or Sky Computing Platform)

Language
Understanding

Text
analytics

Multilingual
technologies

Text
generation

Information and
relation extraction

Knowledge
Emotion/
Sentiment

Language
checking

Sentiment
analysis

Named entity
recognition

Other
companies (SMEs,
startups etc.)

Summarisation

Knowledge access
and management

Data protection
Tools
Data Sets
Resources
Components
Metadata
Standards
Interfaces
APIs
Catalogues
Quality Assurance
Data Import/Export
Input/Output
Storage
Performance
Availability
Scalability

Interfaces (web, speech, mobile etc.)

Beneficiaries/users of the platform
European
Institutions

Research
Centres

Public
Administrations

European
Citizens

Enterprises

LT User
Industries

Universities

Features

Research
Centres
Core Resources & Technologies

[Figure: language labels for the 30 languages of the White Paper Series: Basque, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hungarian, Icelandic, Irish, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Spanish, Swedish.]
META-NET

Conclusions and Next Steps

Conclusions and Next Steps
- The White Paper Series clearly shows that Computational Morphology cannot and must not be considered a “solved problem”.
- Quite the contrary: several good technologies exist only for a small number of languages; many languages lack adequate support.
- The research community needs to team up to discuss synergies and to boost research and technology transfer between its languages.
- The goal should be adequate, precise, robust, scalable and freely available morphology components for all European languages.
- New challenges and opportunities: real-time processing, web-scale processing of and training on documents using big data technologies such as Hadoop, interoperability and standardisation of data formats, morphology as a service etc.
- The sophisticated applications foreseen in our META-NET SRA are critically dependent on reliable and precise basic processing components, including computational morphology!
Conclusions and Next Steps
- Europe is extremely interested in and passionate about its languages.
- Our Strategic Research Agenda for LT research and innovation can put Europe ahead of its competitors in this technology area.
- It provides useful and attractive solutions to European society, at the same time creating huge business opportunities for European industry.
- Now is the time to move forward with a continent-wide, systematic push and to invest in strategic research. A modest investment is required.
- We are very confident that we can help build applications that break down language barriers in Europe and beyond.
- This push will generate countless opportunities.
- This year is important: H2020 and CEF can provide sufficient resources to make our visions for Europe’s citizens and economy a reality.
- META-FORUM 2013, September 19/20, Berlin, Germany.
[Figure: timeline of META-NET milestones from 2010 to 2013 (Vision Groups on Translation and Localisation, Interactive Systems, and Media and Information Services; META-NET website; Language White Paper Series; Strategic Research Agenda; META-FORUM 2013), pointing towards Horizon 2020 and the 2014-2020 funding period.]

Q/A
Acknowledgements: This work would not have been possible
without the dedication and commitment of our colleagues
Aljoscha Burchardt, Kathrin Eichler, Tina Klüwer, Arle Lommel,
Felix Sasaki and Hans Uszkoreit (all DFKI), the 60 member
organisations of the META-NET network of excellence, the ca.
70 members of the Vision Groups, the ca. 30 members of the
META Technology Council, the more than 200 authors of and
contributors to the META-NET Language White Paper Series
and the ca. 200 representatives from industry and research who
contributed to the META-NET Strategic Research Agenda.

Thank you!

META-FORUM 2013
September 19/20, Berlin
http://www.meta-forum.eu
http://www.meta-net.eu
http://www.facebook.com/META.Alliance

Connecting Europe for New Horizons
Economics and Technology — Berlin, Germany

http://www.meta-forum.eu
Register now!
META-FORUM 2013 — Connecting Europe for New Horizons is an
international conference on powerful language technologies for the multilingual
information society, the data value chain and the information market place. The
two special themes of this year's edition of the conference are Big Data Text
Analytics and Multilingual Web Services for Multilingual Europe.
Highlights
Keynote lectures by Daniel Marcu (Chief Science Officer, SDL) and
Wolfgang Wahlster (CEO, German Research Center for Artificial Intelligence, DFKI)
Horizon 2020 and Connecting Europe Facility (CEF): Current State of Play
Dynamic Discussions on:
Technologies for the Multilingual Web
MT for Professionals
Services for Multilingual Europe
Needs of Europe's Languages
Connecting Towards New Horizons
Quality Translation and Innovation
New Stakeholders: GALA (Globalization and Localization Association); NPLD
(Network to Promote Linguistic Diversity); Council of Europe Committee of Experts on the
Charter of Regional and Minority Languages
Towards a European Language Technology Platform
Panel discussions
Awards Ceremony: META Prize and META Seal of Recognition
META Exhibition (industry and research exhibition – software demos and posters)

META-FORUM 2013 will be held jointly by META-NET and the German Federal Ministry of Economics and Technology, co-organised with MultilingualWeb-LT, QTLaunchPad and LT Berlin.


Computational Morphology and the META-NET Strategic Research Agenda for Multilingual Europe 2020

  • 1. The State of Computational Morphology for Europe’s Languages and the META-NET Strategic Research Agenda Georg Rehm Network Manager META-NET DFKI, Berlin, Germany georg.rehm@dfki.de 3rd Int. Workshop on Systems and Frameworks for Computational Morphology (SFCM 2013) Berlin, Germany – September 06, 2013 Co-funded by the 7th Framework Programme and the ICT Policy Support Programme of the European Commission through the contracts T4ME, CESAR, METANET4U, META-NORD (grant agreements no. 249119, 271022, 270893, 270899).
  • 2. Outline
      - Introduction
      - Language White Paper Series: Europe’s Languages in the Digital Age
      - The State of Computational Morphology for Europe’s Languages
      - The META-NET Strategic Research Agenda for Multilingual Europe
      - Conclusions
      http://www.meta-net.eu
  • 3. Multilingual Europe
      - Where were we back in 2010?
      - Challenge: Providing each language community with the most advanced technologies for communication and information so that maintaining their mother tongue does not turn into a disadvantage.
      - While research has made considerable progress in recent years, the pace of progress is not fast enough to meet the challenge within the next 10-20 years.
      - All stakeholders – researchers, LT user and provider industries, language communities, funding programmes, policy makers – should team up in a strategic alliance for a major dedicated push.
  • 4. Objectives
      META-NET is a network of excellence dedicated to fostering the technological foundations of the European multilingual information society.
  • 5. Four EU-Funded Projects
      - Initial project: T4ME (FP7; 13 partners, 10 countries)
      - Three ICT-PSP consortia since Feb. 2011: CESAR, METANET4U, META-NORD
      - All four projects ended on January 31, 2013.
      - All EU member states and several non-member states covered.
      - META-NET in Sept. 2013: 60 members in 34 countries.
      http://www.meta-net.eu/members
  • 6. Europe’s Languages in the Digital Age: Language White Paper Series
  • 7. Language White Paper Series
      - “Europe’s Languages in the Digital Age”.
      - Reports on the state of our languages in the digital age and the level of support through language technology.
      - Series covers 30 languages.
      - Key communication instruments to address decision makers and journalists.
      - Inform about societal and technological problems and challenges as well as economic opportunities.
      - >2 years in the making.
      - >200 national experts as contributors.
      - >8,000 copies printed and distributed to politicians and journalists.
  • 8. Language White Paper Series
      - Structure:
          - Part 1: Executive Summary
          - Part 2: Languages at Risk – A Challenge for Language Technology
          - Part 3: The [X] Language in the European Information Society
          - Part 4: LT support for [X]
          - Part 5: About META-NET; References, etc.
      - Language White Paper Series (published at Springer):
          - Ca. 8,000 printed copies distributed by META-NET.
          - Printed copies can be purchased through the usual channels.
          - Ebooks available via SpringerLink (fee) and META-NET website (free).
          - http://www.meta-net.eu/whitepapers
  • 9. 30 Languages Covered
      Basque, Bulgarian*, Catalan, Czech*, Danish*, Dutch*, English*, Estonian*, Finnish*, French*,
      Galician, German*, Greek*, Hungarian*, Icelandic, Irish*, Italian*, Latvian*, Lithuanian*, Maltese*,
      Norwegian, Polish*, Portuguese*, Romanian*, Serbian, Slovak*, Slovene*, Spanish*, Swedish*, Croatian
      Next up: Welsh
      * = Official EU language
  • 10. Cross-Lingual Comparison
      - In four application areas, each language is assigned to one of five clusters, ranging from excellent LT support to weak/no support:
          1. Machine Translation
          2. Speech Processing
          3. Text Analytics
          4. Language Resources
      - Results finalised at a meeting in Berlin with representatives of all 30 languages (October 21/22, 2011).
  • 11. Cross-Lingual Comparison: clusters per area (excellent, good, moderate, fragmentary, weak or no support)
      MT
          - excellent: (none)
          - good: English
          - moderate: French, Spanish
          - fragmentary: Catalan, Dutch, German, Hungarian, Italian, Polish, Romanian
          - weak or no support: Basque, Bulgarian, Croatian, Czech, Danish, Estonian, Finnish, Galician, Greek, Icelandic, Irish, Latvian, Lithuanian, Maltese, Norwegian, Portuguese, Serbian, Slovak, Slovene, Swedish
      Speech
          - excellent: (none)
          - good: English
          - moderate: Czech, Dutch, Finnish, French, German, Italian, Portuguese, Spanish
          - fragmentary: Basque, Bulgarian, Catalan, Danish, Estonian, Galician, Greek, Hungarian, Irish, Norwegian, Polish, Serbian, Slovak, Slovene, Swedish
          - weak or no support: Croatian, Icelandic, Latvian, Lithuanian, Maltese, Romanian
      Text Analysis
          - excellent: (none)
          - good: English
          - moderate: Dutch, French, German, Italian, Spanish
          - fragmentary: Basque, Bulgarian, Catalan, Czech, Danish, Finnish, Galician, Greek, Hungarian, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovene, Swedish
          - weak or no support: Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian
      Resources
          - excellent: (none)
          - good: English
          - moderate: Czech, Dutch, French, German, Hungarian, Italian, Polish, Spanish, Swedish
          - fragmentary: Basque, Bulgarian, Catalan, Croatian, Danish, Estonian, Finnish, Galician, Greek, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovene
          - weak or no support: Icelandic, Irish, Latvian, Lithuanian, Maltese
  • 12. Europe’s Languages and LT
      [Figure: languages arranged in five tiers, from good support through Language Technology (top) to weak or no support (bottom)]
      - English
      - Dutch, French, German, Italian, Spanish
      - Catalan, Czech, Finnish, Hungarian, Polish, Portuguese, Swedish
      - Basque, Bulgarian, Danish, Galician, Greek, Norwegian, Romanian, Slovak, Slovene
      - Croatian, Estonian, Icelandic, Irish, Latvian, Lithuanian, Maltese, Serbian
  • 13. Not enough R&I on European languages
      - Languages treated in the 2010 editions of the journal Computational Linguistics and of the ACL, EMNLP and COLING conferences.
      - Many European languages with no reference at all: Slovak, Maltese, Lithuanian, Irish, Albanian, Croatian, Galician etc.
      ➔ LT research on European languages, except for English, is too weak and too slow
      ➔ Many languages are badly covered
      [Bar chart: vertical axis from 0 to 450; languages on the horizontal axis, in chart order: English, Chinese, German (Standard), French, Spanish, Japanese, Arabic, Dutch, Portuguese, Czech, Danish, Swedish, Hindi, Korean, Turkish, Italian, Russian, Finnish, Hebrew, Hungarian, Slovene, Urdu, Romanian, Zulu, Bulgarian, Catalan-Valencian-Balear, Greek, Thai, Welsh, Estonian, Basque, German (Swiss), Inuktitut, Indonesian, Ineseño, Latin, Marathi, Malay, Pushto, Serbian, Syriac, Tamil, Ugaritic, Ukrainian, Uspanteko, Vietnamese]
  • 14. Key Observations
      - When it comes to Language Technology support, there are massive differences between Europe’s languages and technology areas.
      - LT support for English is ahead of any other language.
      - Even support for English is far from being perfect.
      - The gap between English and the other languages keeps widening!
      - Several languages – Icelandic, Latvian, Lithuanian, Maltese – receive the weakest score in all four areas!
      - At least 21 European languages in danger of digital extinction! (Languages put into the “weak or no support” category at least once.)
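The two counts on this slide can be checked against the cluster lists from the cross-lingual comparison slide. The sketch below is only illustrative: the dictionary keys (area names) are an assumption about how the four lists in the transcript map to areas, and the variable names are hypothetical. The union and the intersection do not depend on that mapping.

```python
# "Weak or no support" clusters as transcribed from the cross-lingual
# comparison slide. The area labels are an assumption; the set arithmetic
# below gives the same result however the four lists are labelled.
WEAK_OR_NONE = {
    "machine_translation": {
        "Basque", "Bulgarian", "Croatian", "Czech", "Danish", "Estonian",
        "Finnish", "Galician", "Greek", "Icelandic", "Irish", "Latvian",
        "Lithuanian", "Maltese", "Norwegian", "Portuguese", "Serbian",
        "Slovak", "Slovene", "Swedish",
    },
    "speech": {
        "Croatian", "Icelandic", "Latvian", "Lithuanian", "Maltese", "Romanian",
    },
    "text_analytics": {
        "Croatian", "Estonian", "Icelandic", "Irish", "Latvian",
        "Lithuanian", "Maltese", "Serbian",
    },
    "resources": {
        "Icelandic", "Irish", "Latvian", "Lithuanian", "Maltese",
    },
}

# Languages placed in the weakest cluster at least once (union of all areas).
at_least_once = set().union(*WEAK_OR_NONE.values())

# Languages placed in the weakest cluster in every area (intersection).
in_all_areas = set.intersection(*WEAK_OR_NONE.values())

print(len(at_least_once))    # 21
print(sorted(in_all_areas))  # ['Icelandic', 'Latvian', 'Lithuanian', 'Maltese']
```

The nine languages that never fall into the weakest cluster in these lists are Catalan, Dutch, English, French, German, Hungarian, Italian, Polish and Spanish.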
  • 15. White Paper Box Sets (100 copies)
  • 17. Press Campaign
      - Headline of press release: At Least 21 European Languages in Danger of Digital Extinction.
      - Sent out to journalists, politicians and other stakeholder groups on the European Day of Languages (Sept. 26, 2012).
      - Overwhelmed by the huge interest in the topic and our key findings!
      - 600+ mentions in the press.
      - 50+ broadcast interviews with META-NET representatives (ca. 30 radio interviews, ca. 25 television reports).
      - News came in from 40+ countries in 35+ different languages.
      - Whole of Europe covered.
      - Two Parliamentary Questions in the European Parliament on the “digital extinction of languages” topic.
  • 18. Coverage by Country
      [Pie chart: share of press coverage per country, in legend order]
      - Spain 15.90%, Bulgaria 10.80%, International 7.90%, Latvia 5.30%, Netherlands 4.80%
      - Greece 4.60%, Romania 4.40%, Serbia 4.40%, Italy 4.20%, Germany 3.50%, Russia 3.50%
      - Estonia 2.90%, France 2.60%, Slovenia 2.40%, Iceland 2.20%, Malta 2%, USA 1.50%
      - Denmark 1.30%, Latin America 1.30%, Lithuania 1.30%, Ireland 1.30%, UK 1.10%, Belgium 0.90%
      - Finland 0.70%, Sweden 0.70%, Poland 0.70%
      - Norway 0.40%, Mexico 0.40%, Brazil 0.40%, Slovakia 0.40%, Basque Country 0.40%, Portugal 0.40%
      - Austria 0.20%, New Zealand 0.20%, Hungary 0.20%, Bosnia and Herzegovina 0.20%, Costa Rica 0.20%, Cyprus 0.20%, Canada 0.20%, Australia 0.20%
  • 19. Response: Examples
      - Austria: Der Standard.
      - Denmark: Politiken, Berlingske Tidende.
      - Finland: Tiede.
      - Germany: Heise Newsticker, Süddeutsche Zeitung.
      - Greece: in.gr, Πρώτο Θέμα, Prosilipsis.
      - Hungary: Origo.
      - Iceland: Fréttablaðið, Morgunblaðið.
      - Italy: Wired.
      - Norway: Computerworld.
      - Slovenia: Delo, Dnevnik, Demokracija.
      - Serbia: Politika.
      - Spain: El Mundo.
      - UK: Huffington Post.
      - USA: Mashable, NBC News, Reddit.
  • 20. Press Campaign: Highlights
      [Newspaper clipping from Berlingske (Denmark), translated from Danish]

      Words. Researchers are working to improve Danish translations on the internet.

      Poor language technology threatens Danish on the net
      By Jens Ejsing // ejs@berlingske.dk

      The Danish language is having a hard time in the digital world. That is the finding of Danish language researchers and experts in connection with the new international study META-NET, which takes a closer look at how a large number of smaller European languages such as Danish are faring in the digital world.

      The researchers, from the University of Copenhagen and the Danish Language Council (Dansk Sprognævn) among others, conclude that Danish may have an even harder time in the digital world in future, because Google Translate, GPS devices, smartphone applications and other language technology programs are not sufficiently able to handle the many nuances of the Danish language. Bolette Sandford Pedersen, professor of language technology at the University of Copenhagen, believes there is a need for a kind of digital Danish language bank filled with data, so that translations, among other things, become as precise and good as possible. With help from the language bank, researchers can, according to the professor, help companies improve programs that have to handle linguistic knowledge in areas such as machine translation, speech recognition and information retrieval. Erroneous translations would then become rarer, such as when Google Translate turns »hæld olie på panden« (“pour oil into the pan”) into “pour oil on the forehead” in English. In the worst case, such translations are so imprecise that Danes end up opting out of their own language in the digital world.

      Language help for companies
      She acknowledges, however, that “the technology for automatic translation is in many ways fantastic”. “It is just not good enough when it comes to Danish,” she says: “It is as if we, to some extent, leave it in the hands of Google or other companies to decide whether Danish is handled well enough or not. But the Danish market is not big for them. The question is therefore whether we should not do more ourselves to ensure that the necessary data material is available, so that we get good translations and other good language technology. That could be done, for example, by making an effort to set up a language bank with a lot of enriched material on Danish.”

      “If we constantly find that translations are riddled with errors, we do not dare to trust them,” she says, stressing that “erroneous translations can lead to major misunderstandings”.

      According to the director of the Danish Language Council, Sabine Kirchmeier-Andersen, poor language technology can have consequences for the many Danes who are not so good at English. “If we have ambitions to use the Danish language in the technological universe of the future, an effort must be made now to retain expertise and expand the knowledge we have,” she says: “Otherwise we risk that only people who speak fluent English will benefit from the new generations of web, telecom and robot technology that are on the way.”

      Facts: Languages in Europe
      - There are around 80 languages in the EU. For 21 of them, Danish included, there are major language technology gaps in areas such as machine translation, speech recognition and information retrieval.
      - According to an EU study, a growing number of European internet users buy goods or services online where the language used is not their own. This applies to more than half of users.
      - More than one in three uses a foreign language to write emails or posts online.
  • 21. Press Campaign: Highlights
      [Newspaper clipping from ΕΛΕΥΘΕΡΟΣ ΤΥΠΟΣ (Eleftheros Typos, Greece), Thursday 27 September 2012, page 38, translated from Greek. A second clipping is marked: Life, Date 30 September 2012, Page 16.]

      MANY EUROPEAN LANGUAGES ARE CONSIDERED TECHNOLOGICALLY... OUTDATED
      Greek at risk of digital extinction
      By Eleni Vergou, evergou@e-typos.com

      The digital age does not... speak Greek, nor several other European languages, according to a pan-European report signed by more than 200 experts. The study was published by the scientific network META-NET on the occasion of yesterday’s European Day of Languages. For their research, linguists from 34 countries of the Old Continent rated the available language services and produced a “White Paper” for each European language. In their study, the experts looked, among other things, for four basic electronic tools: the existence of automatic translation, the capability for voice interaction and for digital text analysis, and also the availability of language resources. As a first step they examined websites that let users translate online, such as the Google Translate service of the IT giant. At the same time they examined how Greek-speaking users “communicate” with their... devices, for example whether someone can “speak” to a GPS in their mother tongue. The researchers concluded that such devices exist but are not as widespread as the English-language ones. The “gold” medal goes, as one would expect, to English. English-speaking users have the best possible technological support, which favours the further spread of the language.

      Icelandic, Latvian, Lithuanian and Maltese are most at risk of “technological exclusion”, while Greek, Bulgarian, Hungarian and Polish fare slightly better; according to the study they have “fragmentary” technological support. Support for users of Dutch, French, German, Italian and Spanish is rated “moderate”. The heads of the scientific team, Hans Uszkoreit and Georg Rehm, state: “There are dramatic differences in language technology support between the various European languages. The gap between ‘small’ and ‘large’ languages keeps widening. We must ensure that the smaller languages, which are less rich in digital resources, are equipped with the necessary basic technologies. Otherwise these languages are doomed to digital extinction.” The experts stress that without decisive action these languages will hardly... survive in the digital world of the 21st century. Ms Maria Gavrilidou, a member of the scientific team from the Institute for Language and Speech Processing, Athena Research Centre, tells “E.T.”: “This study does not say that the Greek language will not live on or that it is in danger of extinction.” The expert explains that as long as there are people who speak, write and communicate in a language, it will continue to exist. It is important, however, that all users are able to “speak” to machines, such as their GPS, in Greek and have computer language tools at their disposal. These “tools” include spelling and syntax checkers, which are used daily by hundreds of Greek users and are based on language technology.

      Nevertheless, she stresses that the digital spread of a language is important. “It is not in the hands of the average user. Governments, the European Union and the private sector must fund the development of this technology for all languages,” she says, and continues: “Users, however, must demand that these tools exist in their own language too, and not settle for English.”

      Pull quote: English-speaking users have the best possible technological support, which favours the further spread of the language.

      Sidebar: GREEKLISH, the language of alienation… (Georgios Babiniotis)
      Most young people in our country now communicate in GREEKLISH in text messages and email. Although language tools that allow the use of the Greek script have existed for some years, teenagers and young adults do not seem to have “embraced” these technologies. The professor of linguistics, Mr Georgios Babiniotis, tells “E.T.”: “Greeklish is a problem for the Greek language, especially for young people, for a purely linguistic reason. By using Greeklish they become alienated from the form of the word or, as we say, the etymological image that is expressed by the spelling of the word and is linked both to the meaning of the word and to its origin.” The danger that young people face is alienation from the written form of the language. This “familiarity”, however, helps in understanding both the meaning and the origin of a word. “This alienation is not without significance,” says the expert, who explains that the process of writing helps a word to be imprinted and connected with other words of the same root. “When this form of communication is used, these are destroyed, they weaken. It is not fatal, but it will do damage,” says Mr Babiniotis, who advises users to choose the Greek script.

      Copyright material. This may only be copied under the terms of a Newspaper Licensing Agency agreement (www.nla.co.uk) or with written publisher permission. For external republishing rights see www.nla-republishing.com
  • 22. Press Campaign: Highlights
      [Greek newspaper clipping, section ΚΟΣΜΟΣ (World), Sunday 30 September 2012, page 49, translated from Greek]

      Most European languages are at risk of digital extinction
      They lost... my language

      Pull quote: It must be ensured that the smaller languages, which are less rich in digital resources, are equipped with the necessary basic technologies.

      26 September has been established by the Council of Europe as the European Day of Languages but, according to a new European scientific report, 21 of the 30 languages of Europe, Greek among them, face the risk of digital extinction. The study sounds the alarm, having found that digital support for most European languages is deficient or entirely non-existent for users.

      Eaten up by the common languages
      The report, in the form of a series of White Papers (entitled “Languages in the European Information Society”) from the scientific network META-NET, which brings together 60 research centres in 34 countries, points out that languages spoken by a relatively small number of people are at risk because they lack the technological support that widely used languages have. White Papers have been prepared for the following European languages: English, Basque, Bulgarian, Galician, French, German, Danish, Greek, Estonian, Irish, Icelandic, Spanish, Italian, Catalan, Croatian, Latvian, Lithuanian, Maltese, Norwegian (Bokmål and Nynorsk), Dutch, Hungarian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Swedish, Czech and Finnish. Each White Paper is written in the language it concerns and is translated into English.

      Four great dangers
      According to the new study, Icelandic, Latvian, Lithuanian and Maltese face the greatest risk of extinction in a European technological society that increasingly promotes the use of particular languages, above all English.

      But other languages, such as Greek, Bulgarian, Hungarian and Polish, are also at risk in the modern digital world. The META-NET study, to which more than 200 experts contributed, assesses the risk for each language on the basis of four main criteria at the technological/digital level: the existence of automatic translation for the language, the capability for voice interaction, the capability for digital text analysis, and the availability of the relevant digital language resources.

      The strong ones
      The language with the best score on these criteria is of course English, which enjoys comparatively the best technological support (though not the best possible), a fact that facilitates its further spread. It is followed, with satisfactory or moderate technological/digital support, by Dutch, French, German, Italian and Spanish. Greek, like Basque, Catalan, Polish, Hungarian and others, is classed among the languages with only “fragmentary” support, which is exactly why they are considered to be at high risk of extinction.

      Dramatic differences
      According to the editors of the study, Hans Uszkoreit and Georg Rehm, “there are dramatic differences in language technology support between the various European languages and technology areas. The gap between ‘small’ and ‘large’ languages keeps widening. We must ensure that the smaller languages, which are less rich in digital resources, are equipped with the necessary basic technologies, otherwise these languages are doomed to digital extinction.” The hope for these languages is seen in the improvement and wider use of language technology software, which enables the spoken and written processing of the various languages. Examples of these capabilities are electronic spelling and syntax checkers, the interactive personal “assistants” of smartphones (e.g. Siri on the iPhone), automatic translation systems, the electronic dialogue systems of call centres, search engines, the synthetic voice in car navigation systems, and others.

      The basic problem
      What matters, according to the report, is that all these capabilities are also offered to users in their mother tongue, which is at risk of extinction. Without decisive action, the report makes the ominous prediction that these languages will hardly survive in the digital world of the 21st century. One problem is that the software of these language technology systems relies on statistical methods that require enormous quantities of written or spoken data, and that much data is hard to obtain for languages spoken by relatively few people. Moreover, even for widely used languages such as English, the relevant language technology still has weaknesses, which are evident, for example, in highly inadequate and error-ridden automatic translations. The report proposes that a coordinated, large-scale effort be undertaken in Europe in order to gradually create or improve the necessary technologies and to help the languages that are digitally marginalised.
  • 28. Website: Visitors Overview
      [Chart of website visits over time, with annotations: “began sending out press release”, “European Day of Languages”, “unusually high traffic”]
  • 29. Website: Visitors’ Cities
      City with the most visits: Brussels!
  • 30. The State of Computational Morphology for Europe’s Languages
  • 31. Computational Morphology?
      - So, what is the state of Computational Morphology support? Do we have precise, good, reliable tools for all European languages?
      - Answering this question is a non-trivial, difficult and complex task.
      - However, we can provide a rough approximation.
      - In META-NET we had a look at 30 languages (Basque, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hungarian, Icelandic, Irish, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Spanish, Swedish).
      - We gathered data on several aspects that were used to prepare a cross-language comparison, along with statistics, discussions, comparisons, experts’ opinions, etc.
  • 32. Coarse-Grained View
      - We investigated four main areas: Machine Translation; Speech; Text Analytics; Language Resources.
      - Computational Morphology is covered by Text Analytics.
      - Text Analytics comprises, among others,
          - the quality and coverage of existing text analytics technologies (morphology, syntax, semantics),
          - coverage of linguistic phenomena and domains,
          - amount and variety of available applications,
          - quality and coverage of existing lexical resources and grammars.
  • 34. Key Observations
      - When it comes to Language Technology support, there are massive differences between Europe’s languages and technology areas.
      - LT support for English is ahead of any other language.
      - Even support for English is far from being perfect.
      - The gap between English and the other languages keeps widening!
      - Several languages – Icelandic, Latvian, Lithuanian, Maltese – receive the weakest score in all four areas!
  • 35. Simplified Methodology
      - Distributed data collection process in the respective countries.
      - 30 tables provide data for all languages (tools, resources, gaps etc.).
      - Reduce numbers to one final score per language and area.
      - Calibration of tables across languages in smaller groups.
      - Final scores for each area and language were derived from two central features (quality, coverage), resulting in one big table.
      [Table: one column per language (Basque, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hungarian, Icelandic, Irish, Italian, Latvian, Lithuanian, Maltese, Norwegian, Polish, Portuguese, Romanian, Serbian, Slovak, Slovene, Spanish, Swedish) and one row per category listed below; the cell values are not recoverable from this transcript.]
      Rows: Language Technology (Tools, Technologies, Applications)
          - Tokenization, Morphology (tokenization, POS tagging, morphological analysis/generation)
          - Parsing (shallow or deep syntactic analysis)
          - Sentence Semantics (WSD, argument structure, semantic roles)
          - Text Semantics (coreference resolution, context, pragmatics, inference)
          - Advanced Discourse Processing (text structure, coherence, rhetorical structure/RST, argumentative zoning, argumentation, …)
          - Information Retrieval (text indexing, multimedia IR, crosslingual IR)
          - Information Extraction (named entity recognition, event/relation extraction, opinion/sentiment recognition, text …)
          - Language Generation (sentence generation, report generation, text generation)
          - Summarization, Question Answering, advanced Information Access Technologies
          - Machine Translation
          - Speech Recognition
          - Speech Synthesis
          - Dialogue Management (dialogue capabilities and user modelling)
      Rows: Language Resources (Resources, Data, Knowledge Bases)
          - Reference Corpora
          - Syntax-Corpora (treebanks, dependency banks)
          - Semantics-Corpora
          - Discourse-Corpora
          - Parallel Corpora, Translation Memories
          - Speech-Corpora (raw speech data, labelled/annotated speech data, speech dialogue data)
          - Multimedia and multimodal data
          - Language Models
          - Lexicons, Terminologies
          - Grammars
          - Thesauri, WordNets
          - Ontological Resources for World Knowledge (e.g. upper models, Linked Data)
  • 36. Simplified LR/LT Table (German). Scale: 0 = very low, 6 = very high.
  • 37. Text Analytics: Coarse-Grained View
      - Excellent support: no language
      - Good support: English (4.50)
      - Moderate support: Dutch (3.94), French (3.71), German (3.36), Italian (3.50), Spanish (3.77)
      - Fragmentary support: Basque (3.36), Bulgarian (2.80), Catalan (3.21), Czech (3.29), Danish (3.00), Finnish (3.64), Galician (3.43), Greek (2.71), Hungarian (3.79), Norwegian (4.36), Polish (4.07), Portuguese (3.64), Romanian (3.87), Slovak (2.43), Slovene (3.57), Swedish (4.57)
      - Weak or no support: Croatian (2.43), Estonian (3.14), Icelandic (3.50), Irish (3.71), Latvian (3.14), Lithuanian (1.79), Maltese (0.80), Serbian (1.64)
      In parentheses: average scores of the grammatical analysis feature. Several additional categories and features informed and influenced the overall ranking of a language in one of the five categories. Neither the individual scores nor the average scores have been calibrated with regard to the scores assigned to the LT support of other languages. On their own, these scores cannot be used for a cross-language comparison; nevertheless, the average scores show how the authoring teams themselves perceive the state of the grammatical analysis category for their respective language.
  • 38. “Grammatical Analysis” Feature
      Language    Quantity  Availability  Quality  Coverage  Maturity  Sustainability  Adaptability  Average  Level of support (Text Analytics)
      Basque      4         2.5           4        4         4         2.5             2.5           3.36     fragmentary
      Bulgarian   2.4       2             3.6      3.6       2.8       2.4             2.8           2.80     fragmentary
      Catalan     3         2.5           4        4         4         2.5             2.5           3.21     fragmentary
      Croatian    2         1.5           3.5      3         2         1               4             2.43     weak/no
      Czech       4         2             4        4         3         2               4             3.29     fragmentary
      Danish      3         2             4        4         3         2               3             3.00     fragmentary
      Dutch       3.6       5.4           4.8      3.6       4.8       3.6             1.8           3.94     moderate
      English     5         5             5.5      4.5       4.5       3               4             4.50     good
      Estonian    2.5       3.5           3.2      2.8       4         2.5             3.5           3.14     weak/no
      Finnish     3.5       3.5           3.5      4         4         3.5             3.5           3.64     fragmentary
      French      4         4             4        4         4         3               3             3.71     moderate
      Galician    3         5             4        4         3         2               3             3.43     fragmentary
      German      4         2.5           4        4         4         2.5             2.5           3.36     moderate
      Greek       2         1.5           3.5      3         3         3               3             2.71     fragmentary
      Hungarian   4.5       2             4        4.5       4         3               4.5           3.79     fragmentary
      Icelandic   2         5.5           4        3         3.5       3.5             3             3.50     weak/no
      Irish       4         4             3        3         4         4               4             3.71     weak/no
      Italian     3.5       3             4        5         4         3               2             3.50     moderate
      Latvian     2.5       2             3        3.5       4         3               4             3.14     weak/no
      Lithuanian  2         1.5           2.5      2         1.5       1               2             1.79     weak/no
      Maltese     0.8       0.8           0.8      0.8       0.8       0.8             0.8           0.80     weak/no
      Norwegian   4         4.5           4        4         4.5       4.5             5             4.36     fragmentary
      Polish      4         4.5           4.5      4.5       4         4               3             4.07     fragmentary
      Portuguese  3         3             4        4         4.5       2.5             4.5           3.64     fragmentary
      Romanian    4         3.5           4        3.6       4.5       3.5             4             3.87     fragmentary
      Serbian     1         1             2.5      2         2         1.5             1.5           1.64     weak/no
      Slovak      2         2             3        2         2         3               3             2.43     fragmentary
      Slovene     2.5       4             4.5      3.5       3         3               4.5           3.57     fragmentary
      Spanish     3.5       3             5.4      4.5       3.5       3               3.5           3.77     moderate
      Swedish     4.5       3.5           5        4         5         5               5             4.57     fragmentary
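The Average column in the table above appears to be the unweighted mean of the seven feature scores, rounded to two decimals. That is an inference from the numbers, not something the slides state. A minimal Python sketch checks the arithmetic; the helper name `average_score` is hypothetical and the four rows are copied from the table:

```python
# Feature scores copied from the "Grammatical Analysis" table, in the order
# Quantity, Availability, Quality, Coverage, Maturity, Sustainability, Adaptability.
FEATURE_SCORES = {
    "Basque": [4, 2.5, 4, 4, 4, 2.5, 2.5],
    "Dutch": [3.6, 5.4, 4.8, 3.6, 4.8, 3.6, 1.8],
    "English": [5, 5, 5.5, 4.5, 4.5, 3, 4],
    "Maltese": [0.8] * 7,
}


def average_score(scores):
    """Unweighted mean of the feature scores, rounded to two decimals.

    Assumption: equal weights for all seven features (inferred, not stated).
    """
    return round(sum(scores) / len(scores), 2)


averages = {lang: average_score(s) for lang, s in FEATURE_SCORES.items()}

for lang, avg in averages.items():
    print(f"{lang}: {avg:.2f}")
```

For Basque, the seven scores sum to 23.5, and 23.5 / 7 rounds to 3.36, which matches the Average column.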
  • 39. Across Categories
      - The four area rankings of the 30 languages on the five-point scale (from “excellent support” to “weak/no support”) take many different features and factors into account.
      - The “grammatical analysis” data are only one piece of the puzzle: the piece that is closest to Computational Morphology.
      - Let’s have a look at the individual White Papers and the languages as they are ranked, from “good support” to “weak/no support”.
      - The following ranking is in terms of Text Analytics; the excerpts taken from the White Papers refer to morphological tools.
  • 40. Good Support
      - English is the only language considered to have “good support” in terms of Text Analytics.
      - In comparison to certain other languages and language families, the morphology of English is usually considered rather simple and straightforward.
      - Many robust and precise off-the-shelf technologies exist.
      - This is most probably the main reason why the authors of the white paper on English do not discuss morphology components at all, nor any related issues or challenges.
  • 41. Moderate Support
      - Same trend in this category concerning morphological tools.
      - Authors mainly discuss other research and technology gaps, mentioning the existence of, for example, “medium- to high-quality software for basic text analysis, such as tools for morphological analysis and syntactic parsing” (German).
      - Some authors mention morphology on a more superficial level (Italian, Spanish) or not at all (Dutch).
      - The authors of the white paper on French emphasise that large programmes were set up (1994–2000; 2003–2005) to build a set of basic technologies for French, from spoken and written language resources to spoken and written language processing systems.
  • 42. Fragmentary Support 1/4
      - 16 languages only have fragmentary support in Text Analytics.
      - The respective authoring teams report the existence of one or two morphological tools per language.
      - Clear tendency: these tools have limited functionality and a long history including an unclear copyright situation (Hungarian).
      - Neither freely nor immediately available (Danish, Romanian).
      - However, these tools are usually employed in the large office suites (MS Office, Open Office), localisation frameworks or national search engines (Norwegian, Czech, Slovak).
  • 43. Fragmentary Support 2/4
      - Key factors contributing to the small number of morphological components: rich morphological systems; high degree of inflection; lack of morphological distinction for certain nominal cases.
      - These linguistic properties make morphological processing, as well as all approaches based primarily on statistics, a challenge (Basque, Polish, Slovene and other languages).
      - Special characters and encoding systems are mentioned for languages with alphabets that go beyond plain ASCII: processing words when diacritics are missing (web, email) is a challenge. Experts demand more robust error detection algorithms (Czech).
      - Important observation (Basque, Greek): algorithms and approaches developed for English cannot be directly transferred to other languages.
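To illustrate why missing diacritics are a challenge, here is a minimal Python sketch of one simple lookup strategy: index a lexicon by its diacritic-free forms and return every entry that matches an input token. This is not a method described in the White Papers; the function names and the four-word Czech toy lexicon are assumptions made for illustration only.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(word):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Toy lexicon (assumption): "byt" (flat) and "být" (to be) are distinct words.
LEXICON = ["byt", "být", "žena", "město"]


def build_index(lexicon):
    """Map each diacritic-free form to all lexicon words that share it."""
    index = defaultdict(set)
    for word in lexicon:
        index[strip_diacritics(word)].add(word)
    return index


INDEX = build_index(LEXICON)


def candidates(token):
    """Return the lexicon words a token without diacritics could stand for."""
    return INDEX.get(strip_diacritics(token), set())


print(candidates("byt"))   # two candidates: the input is ambiguous
print(candidates("zena"))  # one candidate
```

The ambiguous case shows why a plain lookup is not enough: choosing between the candidates needs context, which is where morphological and statistical components come in. Note also that NFD decomposition does not cover letters without a canonical decomposition, such as Polish "ł".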
  • 44. Fragmentary Support 3/4
      - Languages spoken in smaller countries usually do not receive as much attention and research funding as larger languages, which typically also have a larger base of researchers building actual technologies, maybe even breaking new ground (Greek).
      - Hungarian: a lack of synchronisation between parallel efforts to build morphological processors led to substantial friction loss. This is why several morphological parsers for Hungarian exist, but they use conflicting and incompatible formalisms.
      - Some authors discuss related technologies such as, for example, e-learning tools and systems for second language learners that employ complex morphological components (Czech).
  • 45. Fragmentary Support 4/4
      - Portugal set up a project in 2005 to enable the development of a set of linguistic resources and components to support the processing of Portuguese. Outcome: large corpus and tools for tokenisation, morphosyntactic tagging, inflection analysis, and lemmatisation.
      - Slovakia set up a project to provide processing of Slovak for linguistic research purposes within the National Research and Development Programme. Outcome: tools and data sets that include processors and morphologically annotated corpora.
      - French (1994-2000) had a clear head-start over Portuguese and Slovak in addition to a longer, more established research tradition in this area, which is why it was ranked higher.
      - The Slovak experts conclude that, while certain morphological tools do exist, “those must be further developed and supported.”
  • 46. Weak or No Support 1/2
      - This category concerns eight languages.
      - A small or very small number of morphological tools or components exist (Irish) and are used, even in well-known applications, but they are neither freely available nor accessible for research purposes.
      - Tools are based on very simple approaches that rely on word lists (Lithuanian, Estonian, Croatian).
      - Several of these tools have been in development since the 1980s and are under the control of companies. Researchers often use ispell or aspell (open source) as a technological fallback solution.
      - The complex morphology of languages is mentioned in almost all cases along with the statement that morphology processing must be further developed (Icelandic, Estonian, Croatian, Maltese, Serbian).
  • 47. Weak or No Support 2/2
      - Authors demand more development for basic morphological tools.
      - Perceived as very important: to design and model approaches to the specific linguistic properties of a language without trying to adapt an approach developed for English (Serbian, Estonian).
      - One such step is to set up specific language technology programmes, as has been done, among others, in France, Slovakia and Portugal.
      - In 2000, the Icelandic government set up a national programme with the aim of supporting institutions and companies in creating resources for Icelandic. Outcome: several projects, huge impact on the national field. Among its results are a full-form morphological database of Modern Icelandic inflections, a balanced morphosyntactically tagged corpus, a training model for data-driven POS taggers and an improved spell checker.
  • 48. Summary
      - Solid computational morphology tools only exist for a handful of European languages, i.e., those with many speakers (and funding).
      - The smaller the language, the fewer tools exist.
      - “Fragmentary” or “weak/no support”: 24 of the 30 languages. Very few tools; very limited functionality; availability is a problem.
      - In terms of the full NLP stack, computational morphology cannot be taken for granted; it is by no means a “solved problem”.
      - More original research off the beaten track (i.e., English) needed.
      - More coordination, synergies and research transfer between the languages needed.
      - France, Iceland, Portugal, Slovakia show that large, dedicated funding programmes are needed to support the development of basic LRs/LTs.
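The figure of 24 out of 30 languages can be checked against the Text Analytics clusters on slide 37. A minimal Python sketch; the dictionary name `SUPPORT_LEVELS` is an illustrative choice, and the language lists are copied from that slide:

```python
# Text Analytics support levels per language, as listed on slide 37.
SUPPORT_LEVELS = {
    "good": ["English"],
    "moderate": ["Dutch", "French", "German", "Italian", "Spanish"],
    "fragmentary": [
        "Basque", "Bulgarian", "Catalan", "Czech", "Danish", "Finnish",
        "Galician", "Greek", "Hungarian", "Norwegian", "Polish",
        "Portuguese", "Romanian", "Slovak", "Slovene", "Swedish",
    ],
    "weak/no": [
        "Croatian", "Estonian", "Icelandic", "Irish", "Latvian",
        "Lithuanian", "Maltese", "Serbian",
    ],
}

# Number of languages in each cluster.
counts = {level: len(langs) for level, langs in SUPPORT_LEVELS.items()}
total = sum(counts.values())
limited = counts["fragmentary"] + counts["weak/no"]

print(counts)
print(f"{limited} of {total} languages have fragmentary or weak/no support")
```

With 16 fragmentary and 8 weak/no entries, the tally gives 24 of 30, i.e. 80% of the languages covered by the series.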
  • 49. The META-NET Strategic Research Agenda for Multilingual Europe
  • 50. Three Ingredients
      - Appropriate Actors: Research & Commercialisation
      - Appropriate Programme: Vision & Agenda
      - Appropriate Support: Funding
  • 51. Three Vision Groups
      - Translation and Localisation (technical documentation, official bulletins, GUI localisation, games, services etc.)
          - Target stakeholders: large users of translation services, (machine) translation, software companies, game companies, localisation industry
      - Media and Information Services (audiovisual sector, news, digital libraries, portals, search engines etc.)
          - Target stakeholders: media industries, search engine providers, archives
      - Interactive Systems (mobile assistance, dialogue translation, call centres, etc.)
          - Target stakeholders: mobile software and service providers, telecom industry, call centres
  • 52. Vision Group Meetings
      - Vision Group Translation and Localisation
          - July 23, 2010: Berlin, Germany
          - September 28, 2010: Brussels, Belgium
          - April 7/8, 2011: Prague, Czech Republic
      - Vision Group Media and Information Services
          - September 10, 2010: Paris, France
          - October 15, 2010: Barcelona, Spain
          - April 1, 2011: Vienna, Austria
      - Vision Group Interactive Systems
          - September 10, 2010: Paris, France
          - October 5, 2010: Prague, Czech Republic
          - March 28, 2011: Rotterdam, The Netherlands
  • 53. Planning Process
      [Diagram labels: expert meeting minutes (one set per Vision Group); Vision Group Translation and Localisation Report; Vision Group Media and Information Services Report; Vision Group Interactive Systems Report; Vision Paper; Priority Themes Paper; Strategic Research Agenda; timeline 2010, 2011, 2012.]
  • 54. Planning Process: Documents
      [Diagram as on the previous slide, with document covers. Recoverable cover text:]
      - Vision Document, Vision Group Translation and Localisation: Results of first two meetings. Editors: Aljoscha Burchardt, Georg Rehm. Dissemination Level: Public. Date: 3 December 2010.
      - Vision Document, Vision Group Media and Information Services: Results of first two meetings. Editors: Maria Koutsombogera, Stelios Piperidis. Dissemination Level: Public. Date: 10 November 2010.
      - Vision Document, Vision Group Interactive Systems: Results of first two meetings. Editors: Joseph Mariani, Bernardo Magnini. Dissemination Level: Public. Date: 28 December 2010.
      - Vision Paper: The Future European Multilingual Information Society. Vision Paper for a Strategic Research Agenda. Cover quotation: “People can’t share knowledge if they don’t speak a common language.” Davenport, Thomas H, and Laurence Prusak, Working Knowledge: How Organizations Manage What They Know, Harvard Business School, Boston, 1997, p. 98.
      - Priority Themes Paper: Towards the META-NET Strategic Research Agenda. LT 2020: Vision and Priority Themes for Language Technology Research in Europe until the Year 2020. Cover note: Do you have comments, ideas or suggestions with regard to the content of this document? Please send them to office@meta-net.eu or discuss them online: http://www.meta-net.eu/sra.
      - Strategic Research Agenda (2012).
      Contact on the covers: www.meta-net.eu, office@meta-net.eu, T: +49 30 23895 1833.
  • 55. Preparation of the SRA
      - Strategic Research Agendas of other initiatives were screened.
      - Many suggestions as input from Vision Group members.
      - We discussed procedures, input and structure of the SRA in four meetings of the META Technology Council.
          - Brussels, Belgium, November 16, 2010
          - Venice, Italy, May 25, 2011
          - Berlin, Germany, September 30, 2011
          - Brussels, Belgium, June 19, 2012
      - Additional input in talks, meetings, workshops, discussions, etc.
          - Example: Three HLT Expert Meetings organised by the EC (end of 2011)
      - Almost 200 experts contributed to the SRA (54% from industry; 46% from research; 4% from national/international institutions).
  • 56. Strategic Research Agenda
      - Addresses the problems we identified when preparing the white papers.
      - Three priority research themes and application/innovation scenarios.
      - Can put Europe ahead of its competitors in this technology area.
      - >190 contributors; >2 years.
      - Presented and discussed at 83 conferences and major workshops.
      - Final version ready on Dec. 1, 2012.
      - http://www.meta-net.eu/sra
  • 57. SRA: Contents – Brief Glimpse
      - Set the stage and describe the European situation, the needs and the LT research and industry.
      - Discuss the state of IT, predictions and mega-trends.
      - Our technology vision for 2020.
      - Select and specify priority themes.
      - Suggest a model for speeding up innovation.
      - Outline proposals for the organisation of research and innovation.
  • 58. Priority Themes: 3 + 2
      - We decided on priority themes that (a) support technology progress, (b) lead to solutions that European society needs and (c) lead to solutions from which European industry will benefit as users or as providers.
          - Translingual Cloud
          - Social Intelligence and e-Participation
          - Socially-Aware Interactive Assistants
      - Two additional themes:
          - European Service Platform for Language Technologies
          - Core Technologies for Language Analysis and Production
  • 59. PT1: Translingual Cloud
      - Europe has a big need for translations of publishable quality.
      - Focus on high-quality translation.
      - New research paradigms
          - Inclusion of professional translators into the research process
          - Inclusion of technologists into research on human translation processes
      - Different technological approaches
          - Stronger emphasis on the properties of individual languages
          - A central role for semantics
      - Methods for specific genres & domains
  • 60. Priority Research Theme 1: Translingual Cloud
      [Diagram labels, in transcript order:] Written (twitter, blog, article, newspaper, text with/without metadata etc.) or spoken input (spontaneous spoken language, video/audio, multiple speakers); Extending translation with semantic data and linked open data; Modular combination of analysis, transfer and generation models; From very fast but lower quality to slower but very high quality (including instant quality upgrades); Services and Technologies: Automatic translation and interpretation; Language checking; Post-editing; Workbenches for creative translations; Novel translation and authoring workflows; Quality assurance; Computer-supported human translation; Multilingual content production and text authoring; Trusted service centre (privacy, confidentiality, security of source data); Exploiting strong monolingual analysis and generation methods and resources; Multiple target formats; Domain, task and genre specialisation models; Applications: Crosslingual communication, translation and search; Real-time subtitling, voice-over generation and translating speech from live events; Mobile interactive interpretation; Any device; Target groups: European citizen, language professional, organisations, companies, European institutions, software applications; Multilingual content production (media, web, technical, legal documents); Showcases: translingual spaces for ambient translation; Single access point
  • 61. PT2: Social Intelligence
      - Better decisions by monitoring social media
      - Inclusion of citizens into collective decision processes
      - Opinion formation, consensus building, decision making
      - Evolution of new solutions
      - New forms of democracy: e-democracy, massive participation, transparency
      - Dialogues and debates across language boundaries and across parties, political alliances, social classes
      - Better than binary voting
      - Documented transparent decision processes
  • 62. Priority Research Theme 2: Social Intelligence and e-Participation Mapping large, heterogeneous, unstructured volumes of online content to structured, actionable representations From shallow to deep, from coarse-grained to detailed processing techniques Making language technologies interoperable with knowledge representation and the semantic web “Semantification” of the web: tight integration with the Semantic Web and Linked Open Data Services and Technologies: especially social media, comments, blogs, forums decision-relevant information sentiment analysis and opinion mining including the temporal dimension) cues from arbitrary online content visualising discussions and opinion statements support Applications: Make use of the wisdom of the crowds and processes; modeling evolution of opinions collective deliberation and e-participation wide deliberation on pressing issues Unleashing social intelligence by detecting and monitoring opinions, demands, needs and problems - analysis technologies Target groups: European citizen, European institutions, discussion participants, companies Improved efficiency and quality of decision processes Understanding influence diffusion across social media
  • 63. Priority Research Theme 3: Socially-Aware Interactive Assistants ments, any vocabulary recovery, self-assessment Multilingual capabilities Interacting naturally with and in groups Include human-computer, human-artificial agent and computer-mediated human-human communication Learning and forgetting information Adaptable to the user’s needs and preferences and the environment Services and Technologies: recognition understanding inter-dependencies and synthesis, providing expressive voices incremental conversational speech priority themes models of human communication Applications: dialogue systems modalities (visual, tactile, haptic) environment Proactive, self-aware, user-adaptable Interacts naturally with humans, in any language and modality Can be personalised to individual communication abilities including special needs verbal/non-verbal behaviour, social context Can learn incrementally from all interactions and other sources of information
  • 64. Providers of operational and research technologies and services National Language Institutions Language Service Providers Priority Research Theme 1: Translingual Cloud Language Processing Language Technology Providers Universities Priority Research Theme 2: Social Intelligence & e-Participation European Institutions Priority Research Theme 3: Socially Aware Interactive Assistants European Service Platform for Language Technologies (Cloud or Sky Computing Platform) Language Understanding Text analytics Multilingual technologies Text generation Information and relation extraction Knowledge Emotion/ Sentiment Language checking Sentiment analysis Named entity recognition Other companies (SMEs, startups etc.) Summarisation Knowledge access and management Data protection Tools Data Sets Resources Components Metadata Standards Interfaces APIs Catalogues Quality Assurance Data Import/Export Input/Output Storage Performance Availability Scalability Interfaces (web, speech, mobile etc.) Beneficiaries/users of the platform European Institutions Research Centres Public Administrations European Citizens Enterprises LT User Industries Universities Features Research Centres
  • 65. Core Resources & Technologies
      [Figure: map labels naming the 30 languages: Icelandic, Finnish, Norwegian, Estonian, Swedish, Lithuanian, Danish, Irish, Latvian, Polish, Slovak, English, Dutch, German, Romanian, Czech, Galician, Hungarian, Slovene, Croatian, Basque, Portuguese, French, Serbian, Catalan, Bulgarian, Italian, Spanish, Greek, Maltese.]
  • 66. META-NET Conclusions and Next Steps
  • 67. Conclusions and Next Steps
      - The White Paper Series clearly shows that Computational Morphology cannot and must not be considered a “solved problem”.
      - Quite the contrary: good technologies exist only for a small number of languages; many languages lack adequate support.
      - The research community needs to team up to discuss synergies and to boost research and technology transfer between its languages.
      - The goal should be adequate, precise, robust, scalable and freely available morphology components for all European languages.
      - New challenges and opportunities: real-time processing, web-scale processing of and training on documents using big data technologies such as Hadoop, interoperability and standardisation of data formats, morphology as a service etc.
      - The sophisticated applications foreseen in our META-NET SRA are critically dependent on reliable and precise basic processing components, including computational morphology!
  • 68. Conclusions and Next Steps
      - Europe is extremely interested in and passionate about its languages.
      - Our Strategic Research Agenda for LT research and innovation can put Europe ahead of its competitors in this technology area.
      - Provides useful and attractive solutions to European society, at the same time creating huge business opportunities for European industry.
      - Now is the time to move forward with a continent-wide, systematic push and to invest in strategic research. A modest investment is required.
      - We are very confident that we can help build applications that break down language barriers in Europe and beyond.
      - This push will generate countless opportunities.
      - This year is important: H2020 and CEF can provide sufficient resources to make our visions for Europe’s citizens and economy a reality.
      - META-FORUM 2013, September 19/20, Berlin, Germany.
  • 69. META-FORUM 2013: Connecting Europe for New Horizons. Berlin, Germany. http://www.meta-forum.eu Register now!
      META-FORUM 2013 (Connecting Europe for New Horizons) is an international conference on powerful language technologies for the multilingual information society, the data value chain and the information market place. The two special themes of this year's edition of the conference are Big Data Text Analytics and Multilingual Web Services for Multilingual Europe.
      Highlights:
      - Keynote lectures by Daniel Marcu (Chief Science Officer, SDL) and Wolfgang Wahlster (CEO, German Research Center for Artificial Intelligence, DFKI)
      - Horizon 2020 and Connecting Europe Facility (CEF): Current State of Play
      - Dynamic discussions on: Technologies for the Multilingual Web; MT for Professionals; Services for Multilingual Europe; Needs of Europe's Languages; Connecting Towards New Horizons; Quality Translation and Innovation
      - New Stakeholders: GALA (Globalization and Localization Association); NPLD (Network to Promote Linguistic Diversity); Council of Europe Committee of Experts on the Charter of Regional and Minority Languages
      - Towards a European Language Technology Platform
      - Panel discussions
      - Awards Ceremony: META Prize and META Seal of Recognition
      - META Exhibition (industry and research exhibition: software demos and posters)
      META-FORUM 2013 will be held jointly by META-NET and the German Federal Ministry of Economics and Technology, co-organised with MultilingualWeb-LT, QTLaunchPad and LT Berlin.
      [Timeline graphic, 2010 to 2013, with the labels: Vision Group Translation and Localisation; Vision Group Media and Information Services; Vision Group Interactive Systems; META-NET Website; Language White Paper Series; Strategic Research Agenda; Horizon 2020; 2014-2020.]
  • 70. Q/A
      Acknowledgements: This work would not have been possible without the dedication and commitment of our colleagues Aljoscha Burchardt, Kathrin Eichler, Tina Klüwer, Arle Lommel, Felix Sasaki and Hans Uszkoreit (all DFKI), the 60 member organisations of the META-NET network of excellence, the ca. 70 members of the Vision Groups, the ca. 30 members of the META Technology Council, the more than 200 authors of and contributors to the META-NET Language White Paper Series and the ca. 200 representatives from industry and research who contributed to the META-NET Strategic Research Agenda. Thank you!
      META-FORUM 2013, September 19/20, Berlin. http://www.meta-forum.eu http://www.meta-net.eu http://www.facebook.com/META.Alliance