1. Terminology work and term databases in Estonia
With emphasis on termbase data structures
Arvi Tavast, PhD
qlaara
Riga, 4 November 2015
2. Lexicography Terminology What’s wrong Quantitative
Introduction
From Estonian terminology to termbase data structures
We used to have specialised lexicography that people affectionately called terminology
Then we had a bit of terminology (even applied to general language)
There were calls for a unified termbase of all terms
Which is unfortunately not doable:
    coverage
    reliability
    lack of convention
    theoretical issues
The following presentation gives a bit more detail
Outline
1 Lexicography: semasiological data structures
2 Terminology: onomasiological data structures
3 What’s wrong
    Data structures
    Metaphors of communication
4 Quantitative dictionary data structures
    Data structures
    Division of labour
Semasiological data structures
Words and what they mean
en: table
    1. a piece of furniture with four legs and a flat top
        de: Tisch
    2. layout of data in rows and columns
        de: Tabelle
en: desk
    - an office table
        de: Tisch
        de: Schreibtisch
en: spreadsheet
    - a data layout consisting of rows and columns
        de: Tabelle
        de: Arbeitsblatt
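A minimal Python sketch of the word-first layout above. The class and function names (Entry, Sense, translations) are hypothetical and not from the talk; only the example data comes from the slide. Note that nothing in this structure records that the Tisch under "table" and the Tisch under "desk" are related.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Sense:
    """One meaning of a headword, with equivalents per target language."""
    definition: str
    translations: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Entry:
    """A headword and the senses the editor chose to list under it."""
    lang: str
    headword: str
    senses: List[Sense] = field(default_factory=list)


# Data copied from the slide.
entries = [
    Entry("en", "table", [
        Sense("a piece of furniture with four legs and a flat top",
              {"de": ["Tisch"]}),
        Sense("layout of data in rows and columns",
              {"de": ["Tabelle"]}),
    ]),
    Entry("en", "desk", [
        Sense("an office table", {"de": ["Tisch", "Schreibtisch"]}),
    ]),
    Entry("en", "spreadsheet", [
        Sense("a data layout consisting of rows and columns",
              {"de": ["Tabelle", "Arbeitsblatt"]}),
    ]),
]


def translations(headword: str, target_lang: str) -> List[str]:
    """Collect every equivalent listed under a headword, in listing order."""
    found: List[str] = []
    for entry in entries:
        if entry.headword != headword:
            continue
        for sense in entry.senses:
            for word in sense.translations.get(target_lang, []):
                if word not in found:
                    found.append(word)
    return found
```

Looking up "table" walks its senses and returns the equivalents in the order the editor listed them.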
Onomasiological data structures
Concepts and what they are called
1 A piece of furniture with four legs and a flat top, for eating
    en: table
    de: Tisch
2 A piece of furniture with four legs and a flat top, for writing
    en: desk
    de: Tisch
    de: Schreibtisch
3 Layout of data in rows and columns
    en: table
    en: spreadsheet
    de: Tabelle
    de: Arbeitsblatt
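The concept-first layout above can be sketched the same way. This is an illustration only: the dictionary layout and the helper names (concepts_for, equivalents) are assumptions, while the concepts and terms are the ones on the slide. Every term must be filed under a concept or not, which is the explicit binary decision this structure demands.

```python
from typing import Dict, List

# Concept records keyed by concept number, data copied from the slide.
concepts: Dict[int, dict] = {
    1: {"definition": "A piece of furniture with four legs and a flat top, "
                      "for eating",
        "terms": {"en": ["table"], "de": ["Tisch"]}},
    2: {"definition": "A piece of furniture with four legs and a flat top, "
                      "for writing",
        "terms": {"en": ["desk"], "de": ["Tisch", "Schreibtisch"]}},
    3: {"definition": "Layout of data in rows and columns",
        "terms": {"en": ["table", "spreadsheet"],
                  "de": ["Tabelle", "Arbeitsblatt"]}},
}


def concepts_for(lang: str, term: str) -> List[int]:
    """Return the numbers of all concepts that list the term."""
    return [cid for cid, concept in concepts.items()
            if term in concept["terms"].get(lang, [])]


def equivalents(lang: str, term: str, target_lang: str) -> List[str]:
    """Return target-language terms sharing at least one concept with term."""
    found: List[str] = []
    for cid in concepts_for(lang, term):
        for word in concepts[cid]["terms"].get(target_lang, []):
            if word not in found:
                found.append(word)
    return found
```

Here the ambiguity of "table" and "Tisch" is explicit: each is attached to two concept records, and equivalence is derived from shared concepts rather than listed by hand.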
What’s wrong
Data structures
Semasiology
    Pro: easy for the editor, understandable for the reader
    Con: no support for consistency
    A narrative about the editor, not a data source about language
Onomasiology
    Pro: consistency, scalability, standardisation
    Con: need for explicit binary decisions
    An oversimplified data source about language; works if concepts are known
Both
    Binary: either means or does not mean, there is no scale
    Introspective: claims are not falsifiable
    Simplistic: assume the concepts are (or can be) known
The channel metaphor of communication
What’s wrong
The channel metaphor vs uncertainty reduction
The encoding of a message must contain a number of discriminable
states that is greater than or equal to the number of
discriminable states in the to-be-encoded message
or:
Encoding thoughts with words can only work if the number of
possible thoughts is smaller than or equal to the number of
possible words
This is the case only in very restricted domains (e.g. weather
forecasts)
Ramscar, M. et al. 2010. The Effects of Feature-Label-Order and Their Implications
for Symbolic Learning. Cognitive Science 34(6): 909–957.
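The counting argument above can be illustrated with a toy Python sketch. All names and data here are hypothetical: the forecast labels, the code words and the round-robin assignment are assumptions made for the demonstration. Round-robin is only one possible assignment, but by the pigeonhole principle any assignment of more messages than codes must reuse a code.

```python
from typing import Dict, List


def assign_codes(messages: List[str], codes: List[str]) -> Dict[str, List[str]]:
    """Assign codes to messages round-robin; map each code to its messages."""
    by_code: Dict[str, List[str]] = {}
    for i, message in enumerate(messages):
        code = codes[i % len(codes)]
        by_code.setdefault(code, []).append(message)
    return by_code


def is_unambiguous(by_code: Dict[str, List[str]]) -> bool:
    """True if every used code stands for exactly one message."""
    return all(len(msgs) == 1 for msgs in by_code.values())


# Restricted domain: fewer messages than codes, so decoding is possible.
forecasts = ["sunny", "rain", "snow"]
words = ["A", "B", "C", "D"]
restricted = assign_codes(forecasts, words)

# Open domain: more thoughts than words, so some word is ambiguous.
thoughts = ["t1", "t2", "t3", "t4", "t5"]
few_words = ["A", "B"]
open_domain = assign_codes(thoughts, few_words)
```

In the restricted case each code maps back to one forecast. In the open case the hearer who receives "A" cannot tell which of three thoughts was meant.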
Quantitative data structures
Words (lexomes), their relatedness and other numerical parameters
Empirical data sources, rather than introspective
    Corpus research, frequencies, collocations, distributional semantics
    Human experimental judgements
    NB Meaning is inherently introspective, not measurable. Relative meaning is measurable.
Quantified data, rather than binary
    Types of relatedness: synonyms, equivalents, cohyponyms, etc.
    Other numerical parameters: frequency, valence, emotion, reaction times, naming latencies, neighbourhood density, relative entropy, median absolute deviation, morphological distribution, search statistics, etc.
Quantitative data structures
Relatedness can be quantified and presented as a graph or a table
              table1  table2  desk  spreadsheet  Tisch  Schreibtisch  Tabelle  Arbeitsblatt
table1        1       0       0.1   0            0.6    0.4           0        0
table2        0       1       0     0.5          0      0             0.8      0.8
desk          0.1     0       1     0            0.6    0.8           0        0
spreadsheet   0       0.5     0     1            0      0             0.7      0.8
Tisch         0.6     0       0.6   0            1      0.8           0        0
Schreibtisch  0.4     0       0.8   0            0.8    1             0        0
Tabelle       0       0.8     0     0.7          0      0             1        0.8
Arbeitsblatt  0       0.8     0     0.8          0      0             0.8      1
Fictional data for demonstration purposes only
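A short Python sketch of how such a table could be queried. The function name (related) and the 0.5 default threshold are assumptions for illustration; the numbers are the fictional values from the slide, not measurements.

```python
from typing import List, Tuple

words = ["table1", "table2", "desk", "spreadsheet",
         "Tisch", "Schreibtisch", "Tabelle", "Arbeitsblatt"]

# Fictional relatedness scores copied from the slide, one row per word.
matrix = [
    [1,   0,   0.1, 0,   0.6, 0.4, 0,   0],    # table1
    [0,   1,   0,   0.5, 0,   0,   0.8, 0.8],  # table2
    [0.1, 0,   1,   0,   0.6, 0.8, 0,   0],    # desk
    [0,   0.5, 0,   1,   0,   0,   0.7, 0.8],  # spreadsheet
    [0.6, 0,   0.6, 0,   1,   0.8, 0,   0],    # Tisch
    [0.4, 0,   0.8, 0,   0.8, 1,   0,   0],    # Schreibtisch
    [0,   0.8, 0,   0.7, 0,   0,   1,   0.8],  # Tabelle
    [0,   0.8, 0,   0.8, 0,   0,   0.8, 1],    # Arbeitsblatt
]


def related(word: str, threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return other words scoring at least threshold, strongest first."""
    row = matrix[words.index(word)]
    hits = [(other, score) for other, score in zip(words, row)
            if other != word and score >= threshold]
    # Stable sort: ties keep the column order of the table.
    return sorted(hits, key=lambda pair: -pair[1])
```

Unlike the two structures shown earlier, the answer is graded: lowering the threshold surfaces weaker hints, and the user decides how much relatedness is enough.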
Division of labour
Dumb user, smart dictionary vs smart user, dumb dictionary
A smart dictionary provides the correct answers
A dumb dictionary provides hints, like a thesaurus or synonym
dictionary
A dumb user looks for definite answers
A smart user can figure out the answer based on even subtle
hints
Thanks for listening
Contacts and recommended reading
Slides:
www.slideshare.net/arvitavast
Contact:
arvi@qlaara.com
Easy reading:
blog.qlaara.com
Pointer to the real stuff:
Ramscar, M. et al. 2010. The Effects of Feature-Label-Order and Their Implications
for Symbolic Learning. Cognitive Science 34(6): 909–957.