2. Truly a “Global” World
• We live in societies that require that we are able
to communicate across geography, culture, and
language.
• Being able to arrive at the same concept,
regardless of geography, culture or language is a
necessity in commerce and communication.
• Taxonomies and thesauri are the ways that we
organize and describe the world that we live in,
whether we are consciously aware of them or
not!
3. Building Multilingual Taxonomies
We will look at 3 approaches to building
and managing multilingual
taxonomies/thesauri in this presentation
and the pros and cons of each:
1. Single Vocabulary Method
2. Asymmetric Multilingual Vocabularies
3. Symmetric Multilingual Vocabularies
4. Single Vocabulary Method
• With this method, one builds the taxonomy
around a primary, or dominant, language, and
all structure is assigned based on that
language.
• All translations and associated translated
metadata are assigned as attributes of the
primary language term.
5. Single Vocabulary Method
The primary language as well
as each translation for the
term and associated metadata
are stored as attributes.
6. Single Vocabulary Method
• Hierarchical structure is determined by the primary
language.
• Consequently, that language also dictates cultural
and localization values.
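The single vocabulary method can be sketched as a small data structure. This is a minimal illustration, not a reference implementation: the `Term` class, its field names, and the sample labels are all hypothetical. It shows translations stored as attributes of the primary-language term, with hierarchy defined only between primary terms.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """Hypothetical term record keyed by its primary-language label."""
    label: str                                        # primary-language descriptor
    primary_lang: str = "en"
    translations: dict = field(default_factory=dict)  # language code -> translated label
    broader: Optional["Term"] = None                  # hierarchy links primary terms only

    def label_in(self, lang: str) -> str:
        # Fall back to the primary label when no translation is stored.
        if lang == self.primary_lang:
            return self.label
        return self.translations.get(lang, self.label)

    def path(self, lang: str) -> list:
        # Walk the primary-language hierarchy, rendering each label in `lang`.
        node, labels = self, []
        while node is not None:
            labels.append(node.label_in(lang))
            node = node.broader
        return list(reversed(labels))


# Sample data (illustrative only).
animals = Term("animals", translations={"fr": "animaux", "de": "Tiere"})
dogs = Term("dogs", translations={"fr": "chiens"}, broader=animals)

print(dogs.path("fr"))
print(dogs.path("de"))
```

Because the structure lives only on the primary term, every language inherits the same hierarchy, and a missing translation simply falls back to the primary label.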
7. Pros and Cons of the Method
Pros:
• Simplest of the three methods we will discuss
to design and maintain
• Least resource intensive to manage
Cons:
• Most limiting of the methods
• One language is dominant
• Synonyms may not vary across languages
8. Asymmetric Multilingual Vocabularies
• This method uses wholly independent, fully
structured taxonomies for each language with
concepts joined using equivalency (LE or EQ)
relationships.
• A single language may be selected as the
exchange language through which all languages
are linked
9. Asymmetric Multilingual Vocabularies
Though not always recommended, each
vocabulary may be built with its own
structure and its own number of
concepts to achieve full localization.
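The asymmetric method might be sketched as follows, assuming English is chosen as the exchange language. The vocabularies, the `to_exchange` mapping, and the `depth` and `translate` helpers are hypothetical names used only for illustration. Each language keeps its own hierarchy, and concepts are joined only through equivalence links.

```python
# Each language has an independent hierarchy: term -> broader term (None at the top).
vocab_en = {"pets": None, "dogs": "pets", "cats": "pets"}
vocab_fr = {
    "animaux": None,
    "animaux de compagnie": "animaux",
    "chiens": "animaux de compagnie",
}
vocab_de = {"Haustiere": None, "Hunde": "Haustiere"}

# Equivalence links, all routed through the exchange language (English here).
to_exchange = {
    "fr": {"chiens": "dogs", "animaux de compagnie": "pets"},
    "de": {"Hunde": "dogs", "Haustiere": "pets"},
}


def depth(vocab, term):
    """Number of broader-term steps from `term` up to the top of its own vocabulary."""
    steps = 0
    while vocab[term] is not None:
        term = vocab[term]
        steps += 1
    return steps


def translate(term, src, dst, exchange="en"):
    """Map a term between languages via the exchange language; None if unmapped."""
    pivot = term if src == exchange else to_exchange[src].get(term)
    if pivot is None or dst == exchange:
        return pivot
    # Invert the target language's mapping to find its local term.
    for local, exchange_term in to_exchange[dst].items():
        if exchange_term == pivot:
            return local
    return None


print(translate("chiens", "fr", "de"))
print(depth(vocab_en, "dogs"), depth(vocab_fr, "chiens"))
```

In this sample data the French vocabulary has an extra level and the English one has a concept ("cats") with no equivalent elsewhere, which is the kind of divergence this method permits.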
10. Pros and Cons of the Method
Pros:
• Provides for the most complete localization
• Each language may have a unique set of
attributes
• No one language is dominant
• New languages may be readily added
• Synonyms may vary across languages
Cons:
• Most resource intensive method to manage
• Less harmonized than the symmetric model
11. Symmetric Multilingual Vocabularies
• This model is strongly encouraged by the former and
current ISO standards (5964 and 25964-1)
• Every concept should have a Preferred Term (PT) in
each language
• All languages should share a common hierarchical
and associative structure
• Each language supports independent synonym sets
12. Symmetric Multilingual Vocabularies
• There should be an instance of every preferred
term in all languages.
• These terms may then be related via an
equivalency (LE or EQ) relationship or by making
them preferred labels to be applied to abstract
SKOS concepts.
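The symmetric model, with language labels bound to abstract concepts, can be sketched in plain Python. This is only an illustration under stated assumptions: the `Concept` class, the `LANGUAGES` tuple, and the `missing_pref_labels` helper are hypothetical and are not part of SKOS or any ISO standard. The check reports every concept that lacks a preferred term in some language.

```python
from dataclasses import dataclass, field
from typing import Optional

LANGUAGES = ("en", "fr", "de")  # assumed set of supported languages


@dataclass
class Concept:
    """Hypothetical language-neutral concept, loosely modelled on a SKOS concept."""
    concept_id: str
    pref_label: dict                                 # language code -> one preferred term
    alt_labels: dict = field(default_factory=dict)   # language code -> list of synonyms
    broader: Optional[str] = None                    # shared hierarchy, by concept id


def missing_pref_labels(concepts, languages=LANGUAGES):
    """Return sorted (concept_id, language) pairs that have no preferred term."""
    gaps = []
    for concept in concepts:
        for lang in languages:
            if not concept.pref_label.get(lang):
                gaps.append((concept.concept_id, lang))
    return sorted(gaps)


# Sample data (illustrative only): synonym sets differ per language,
# while the hierarchy (c2 is narrower than c1) is shared by all languages.
concepts = [
    Concept("c1", {"en": "animals", "fr": "animaux", "de": "Tiere"}),
    Concept(
        "c2",
        {"en": "dogs", "fr": "chiens"},
        alt_labels={"en": ["canines", "hounds"], "fr": ["toutous"]},
        broader="c1",
    ),
]

print(missing_pref_labels(concepts))
```

A check like this could run before publishing a vocabulary to flag concepts that still need a preferred term in one of the languages.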
13. Pros and Cons of the Method
Pros:
• Allows for management of unique attributes for each
language
• No one language is dominant
• Synonyms may vary across languages
• Much less intensive to manage because
all languages share a common structure
Cons:
• May not allow for subtle differences of language and
culture to be expressed through variations in concepts
and relational structure
14. Conclusions
• There are several options for managing multilingual vocabularies and each method possesses
some advantages and disadvantages.
• ISO Standards (25964-1) strongly recommend a
symmetric approach whenever possible.
• SKOS-XL provides an effective format that
supports the ISO symmetric model.
• One may employ an asymmetric method when
necessary, but beware the extra costs!
ISO 25964 (4.1) says the aim of a thesaurus ‘is to guide the indexer and the searcher to choose the same term for the same concept…’ This is the key idea behind creating standards for managing multilingual vocabularies. The next few slides explore some different approaches to achieving this.
You might want to verbalise that, although there are slots for each language, only the dominant language is used as the descriptor for the term record.
*Multiple Monolingual vocabularies which are mapped to one another
*work on diagram
*every preferred term (concept) in one language should have an equivalent preferred term (concept) in all other languages. Binding language labels to an abstract SKOS concept.