Terminology work and term databases in Estonia
With emphasis on termbase data structures
Arvi Tavast, PhD
qlaara
Riga, 4 November 2015
Lexicography Terminology What’s wrong Quantitative
Introduction
From Estonian terminology to termbase data structures
We used to have specialised lexicography that people
affectionately called terminology
Then we had a bit of terminology
(even applied to general language)
There were calls for a unified termbase of all terms
Which is unfortunately not doable:
coverage
reliability
lack of convention
theoretical issues
The following presentation gives a bit more detail
Outline
1 Lexicography: semasiological data structures
2 Terminology: onomasiological data structures
3 What’s wrong
Data structures
Metaphors of communication
4 Quantitative dictionary data structures
Data structures
Division of labour
Semasiological data structures
Words and what they mean
en: table
1. a piece of furniture with four legs and a flat top
de: Tisch
2. layout of data in rows and columns
de: Tabelle
en: desk
- an office table
de: Tisch
de: Schreibtisch
en: spreadsheet
- a data layout consisting of rows and columns
de: Tabelle
de: Arbeitsblatt
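The word-first layout above could be modelled roughly as follows. This is a minimal sketch, not the schema of any particular dictionary system: the names `ENTRIES` and `translations` and the field names `definition` and `de` are assumptions made for illustration.

```python
# Semasiological (word-first) structure: headword -> list of senses.
# Field names and layout are illustrative assumptions, not a real schema.
ENTRIES = {
    "table": [
        {"definition": "a piece of furniture with four legs and a flat top",
         "de": ["Tisch"]},
        {"definition": "layout of data in rows and columns",
         "de": ["Tabelle"]},
    ],
    "desk": [
        {"definition": "an office table",
         "de": ["Tisch", "Schreibtisch"]},
    ],
    "spreadsheet": [
        {"definition": "a data layout consisting of rows and columns",
         "de": ["Tabelle", "Arbeitsblatt"]},
    ],
}


def translations(headword: str) -> list[str]:
    """Return all German equivalents of a headword, across its senses."""
    result = []
    for sense in ENTRIES.get(headword, []):
        for equivalent in sense["de"]:
            if equivalent not in result:
                result.append(equivalent)
    return result


print(translations("table"))  # ['Tisch', 'Tabelle']
```

Nothing in this structure links sense 2 of "table" to "spreadsheet": the two definitions are worded differently, and only a human reader can tell that they describe the same thing.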
Onomasiological data structures
Concepts and what they are called
1 A piece of furniture with four legs and a flat top, for eating
en: table
de: Tisch
2 A piece of furniture with four legs and a flat top, for writing
en: desk
de: Tisch
de: Schreibtisch
3 Layout of data in rows and columns
en: table
en: spreadsheet
de: Tabelle
de: Arbeitsblatt
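A concept-first layout like the one above might be sketched as follows. The names `CONCEPTS`, `concepts_for` and `equivalents`, and the field names, are hypothetical and chosen only for this illustration; real termbases use richer schemas.

```python
# Onomasiological (concept-first) structure: each concept has one
# definition and a list of terms per language.
# Field names are illustrative assumptions, not a real termbase schema.
CONCEPTS = [
    {"id": 1,
     "definition": "A piece of furniture with four legs and a flat top, for eating",
     "terms": {"en": ["table"], "de": ["Tisch"]}},
    {"id": 2,
     "definition": "A piece of furniture with four legs and a flat top, for writing",
     "terms": {"en": ["desk"], "de": ["Tisch", "Schreibtisch"]}},
    {"id": 3,
     "definition": "Layout of data in rows and columns",
     "terms": {"en": ["table", "spreadsheet"], "de": ["Tabelle", "Arbeitsblatt"]}},
]


def concepts_for(term: str, lang: str) -> list[int]:
    """Return the ids of all concepts that list `term` in language `lang`."""
    return [c["id"] for c in CONCEPTS if term in c["terms"].get(lang, [])]


def equivalents(term: str, source: str, target: str) -> list[str]:
    """Return target-language terms that share a concept with `term`."""
    result = []
    for concept in CONCEPTS:
        if term in concept["terms"].get(source, []):
            for candidate in concept["terms"].get(target, []):
                if candidate not in result:
                    result.append(candidate)
    return result


print(concepts_for("table", "en"))       # [1, 3]
print(equivalents("table", "en", "de"))  # ['Tisch', 'Tabelle', 'Arbeitsblatt']
```

Here each definition is stored once per concept, so "table" and "spreadsheet" are explicitly tied to the same entry. The price is that every term must be assigned to a concept with a yes-or-no decision.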
Example
Latvian-Estonian dictionary
What’s wrong
Data structures
Semasiology
Pro: easy for the editor, understandable for the reader
Con: no support for consistency
A narrative about the editor, not a data source about language
Onomasiology
Pro: consistency, scalability, standardisation
Con: need for explicit binary decisions
An oversimplified data source about language; works if
concepts are known
Both
Binary: a word either means something or it does not; there is no scale
Introspective: claims are not falsifiable
Simplistic: both assume that the concepts are (or can be) known
The channel metaphor of communication
What’s wrong
The channel metaphor vs uncertainty reduction
Encoding of a message must contain a set of discriminable
states that is greater than or equal to the number of
discriminable states in the to-be-encoded message
or:
Encoding thoughts with words can only work if the number of
possible thoughts is smaller than or equal to the number of
possible words
This is the case only in very restricted domains (e.g. weather
forecasts)
Ramscar, M. et al. 2010. The Effects of Feature-Label-Order and Their Implications
for Symbolic Learning. Cognitive Science 34(6): 909–957.
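The counting argument above can be illustrated with a toy example. This is a deliberately simplified sketch that assumes one word per thought; the function name `count_collisions` and the sample thoughts and words are invented for the illustration and do not come from the cited paper.

```python
def count_collisions(thoughts: list[str], words: list[str]) -> int:
    """Assign one word to each thought, reusing words only when forced to.

    Returns how many thoughts cannot get a word of their own.
    Assumes `thoughts` contains no duplicates and `words` is non-empty.
    """
    if not words:
        raise ValueError("at least one word is needed")
    # Hand out words in order; wrap around once they run out.
    code = {thought: words[i % len(words)] for i, thought in enumerate(thoughts)}
    return len(thoughts) - len(set(code.values()))


words = ["sun", "rain", "snow"]

# Restricted domain: as many words as thoughts, so every thought is distinct.
print(count_collisions(["sunny day", "rainy day", "snowy day"], words))  # 0

# One more thought than words: two thoughts must share a word.
print(count_collisions(
    ["sunny day", "rainy day", "snowy day", "drizzle turning to sleet"],
    words))  # 1
```

As soon as there are more possible thoughts than available codes, at least two thoughts share a code, and the receiver cannot recover the sender's thought from the code alone.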
Quantitative data structures
Words (lexomes), their relatedness and other numerical parameters
Empirical data sources, rather than introspective
Corpus research, frequencies, collocations, distributional
semantics
Human experimental judgements
NB Meaning is inherently introspective, not measurable.
Relative meaning is measurable
Quantified data, rather than binary
Types of relatedness: synonyms, equivalents, cohyponyms, etc.
Other numerical parameters: frequency, valence, emotion,
reaction times, naming latencies, neighbourhood density,
relative entropy, median absolute deviation, morphological
distribution, search statistics, etc.
Quantitative data structures
Relatedness can be quantified and presented as a graph or a table
              table1  table2  desk  spreadsheet  Tisch  Schreibtisch  Tabelle  Arbeitsblatt
table1        1       0       0.1   0            0.6    0.4           0        0
table2        0       1       0     0.5          0      0             0.8      0.8
desk          0.1     0       1     0            0.6    0.8           0        0
spreadsheet   0       0.5     0     1            0      0             0.7      0.8
Tisch         0.6     0       0.6   0            1      0.8           0        0
Schreibtisch  0.4     0       0.8   0            0.8    1             0        0
Tabelle       0       0.8     0     0.7          0      0             1        0.8
Arbeitsblatt  0       0.8     0     0.8          0      0             0.8      1
Fictional data for demonstration purposes only
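A relatedness table of this kind could be stored and queried as sketched below. The values are the same fictional demonstration figures as on the slide; the names `WORDS`, `RELATEDNESS` and `related`, and the default threshold of 0.5, are assumptions made for this example.

```python
# Fictional relatedness scores, for demonstration purposes only.
WORDS = ["table1", "table2", "desk", "spreadsheet",
         "Tisch", "Schreibtisch", "Tabelle", "Arbeitsblatt"]

RELATEDNESS = [
    [1,   0,   0.1, 0,   0.6, 0.4, 0,   0],    # table1
    [0,   1,   0,   0.5, 0,   0,   0.8, 0.8],  # table2
    [0.1, 0,   1,   0,   0.6, 0.8, 0,   0],    # desk
    [0,   0.5, 0,   1,   0,   0,   0.7, 0.8],  # spreadsheet
    [0.6, 0,   0.6, 0,   1,   0.8, 0,   0],    # Tisch
    [0.4, 0,   0.8, 0,   0.8, 1,   0,   0],    # Schreibtisch
    [0,   0.8, 0,   0.7, 0,   0,   1,   0.8],  # Tabelle
    [0,   0.8, 0,   0.8, 0,   0,   0.8, 1],    # Arbeitsblatt
]


def related(word: str, threshold: float = 0.5) -> list[tuple[str, float]]:
    """Return other words whose relatedness to `word` is at least `threshold`,
    strongest first (ties broken alphabetically)."""
    row = RELATEDNESS[WORDS.index(word)]
    pairs = [(other, score) for other, score in zip(WORDS, row)
             if other != word and score >= threshold]
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


print(related("desk"))  # [('Schreibtisch', 0.8), ('Tisch', 0.6)]
```

Unlike a binary "means / does not mean" entry, the query returns a ranked list of hints, and the threshold is left to the user.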
Division of labour
Dumb user, smart dictionary vs smart user, dumb dictionary
A smart dictionary provides the correct answers
A dumb dictionary provides hints, like a thesaurus or synonym
dictionary
A dumb user looks for definite answers
A smart user can figure out the answer based on even subtle
hints
Thanks for listening
Contacts and recommended reading
Slides:
www.slideshare.net/arvitavast
Contact:
arvi@qlaara.com
Easy reading:
blog.qlaara.com
Pointer to the real stuff:
Ramscar, M. et al. 2010. The Effects of
Feature-Label-Order and Their Implications for Symbolic
Learning. Cognitive Science 34(6): 909–957.
