Multilingual Information and Retrieval Systems
Technology and Applications
Dr. Ulrich Kampffmeyer
PROJECT CONSULT GmbH
IMC Congress, Brussels 1993
Dr. Ulrich Kampffmeyer
• VOI Verband Optische Informationssysteme, Roßdorf / Darmstadt
German Association of Manufacturers and Resellers of Digital Optical Media,
Systems and Software (Chairman of the Board)
• PROJECT CONSULT Unternehmensberatung Dr. Ulrich Kampffmeyer GmbH
Wachenheim, Hamburg, Darmstadt
Abstract
This paper on multilingual information and retrieval systems with optical mass
storage describes the technical principles of software design. The different layers
and modules from the user interface via transformation modules, thesaurus modules
and fulltext interpretation to database management are explained in detail. Two
examples of multilingual document imaging systems are presented:
- wfBase multilingual press and commerce information system based on
four ISDN nodes in Switzerland;
- HEMIS multilingual information system for CD-ROM distribution on
environmental institutions, projects and programmes of the
UN Environmental Programme UNEP/HEM.
© Copyright PROJECT CONSULT GmbH 1993
Contents
1. The Importance of Multilingual Software Systems With Optical Storage
Media for the European Economic Region
2. Software Design
2.1 Structural and Other Requirements for Multilingual Software
2.2 User Interface and Application
2.3 Transformation Modules
2.4 Selection Lists
2.5 Thesauri
2.6 Fulltext Translation
3. Sample Applications
3.1 wfBase
3.2 HEMIS
4. Outlook and Summary
1. The Importance of Multilingual Software Systems With Optical Storage
Media for the European Economic Region
Europe 1993 is a catch-phrase that is often heard. But opening the borders and
removing trade barriers will not eliminate the cultural and language differences
between countries. These differences are a concern for all firms and organizations
that operate in more than one country.
Overcoming the language barrier is not simply a matter of lexical comprehension
and translation. It involves many levels of differing interpretations, meanings in
various contexts, and adaptation of specialized vocabulary. In business and
commerce, mere translation is not enough; the unwritten laws of the target specialist
language must be adhered to.
In addition, the organization working across national boundaries must take into
account differing units of measure, currency, and conventions (date formats,
addresses, orthography).
Multilingual software is a requirement wherever users require access to the same
information regardless of the nature of the source. This is particularly the case for:
- Trading firms
- Service firms
- International authorities and institutions
- Manufacturers with suppliers and subcontractors in more than one
country
- Communications firms
- Banks
- Insurance companies
- Authorities and bodies such as police, air-traffic control, disaster
relief organizations, environmental monitoring agencies, etc.
- Others
English is often used as a de facto communications standard. However, the use of a
language that is foreign to its speakers can lead to misinterpretations and
misunderstandings when the user is not familiar with the exact meaning,
interrelationships, and contextual significance of terms and phrases. A "working
knowledge" of a language is not enough.
As software and the information underlying it become more complex, the support
provided by the software must become friendlier and more comprehensive. This is
especially true for the user interface, information on current actions, status
messages, context-sensitive information (especially with user mistakes or critical
program branchings) and help screens. The latter must be available in index form as
well as context-sensitive.
Modern "Windows"-oriented programs generally include these features. However,
as in most programs, the information they contain is available in only one user
language.
Most standard software today comes from a few leading software houses in the
United States. Consequently the software and documentation are available in English,
or American English, first. The various language versions are then translated from
the original English version. The translations become available with more or less
delay in various release standards, depending on the relative importance of the
national market. In such standard software, the screens, associated texts, etc. are
contained in the main body of the program, making translation with adaptation of the
screens and texts a very complex undertaking.
Even when different users access identical information, they cannot change the user
language while the program is running. Instead, the complete target language
version must be started, at considerable cost in time. In addition, most standard
software lacks the integrated database or resource management components that
enable the administration of different language and function modules, not to speak of
the creation and maintenance of such modules.
Thus, "traditional" standard software has a built-in bias against multilingual use.
This article will examine database information systems that are suitable for
multilingual applications.
2. Software Design
Like the ability to load modular program segments and functions separately,
multilingualism must be designed in from the start. It is well-nigh impossible to
modify finished software to support multilingual operation. In such cases, it makes
more sense to completely redesign the software using modern tools.
2.1 Structural and Other Requirements for Multilingual Software
Multilingual software is subject to the following design criteria (Fig. 1):
a) Modular design with clear logical and software partitioning of the various levels
(user interface, main program, resources, transformation modules, database,
etc.). Interaction is controlled through messages and global variables.
b) No text components may be contained in the program segments responsible for
execution, but must be referenced by variables. The user can switch from one
language to another using a global variable.
c) Texts are kept in resource libraries and accessed by variables. The libraries must
be simple to maintain and the texts must be accessible and loadable in the
application during runtime.
d) All parts of an application must have defined interfaces. This is particularly the
case for the user interface, the actual application itself, the operating system and
all additional application modules.
e) The application, user interface and operating system must support variable text
field lengths and positions, since these can vary greatly from language to
language.
f) The application, operating system, screen and printer drivers, and the database
must support a variety of fonts, character sets, sortings, data formats, etc. This
requires that the underlying operating system support this.
Multilingual Software - Design Principles
- Modular design with clear separation of user interface, operating system and
application (database)
- Every text component has to be referenced by a key variable in the application
- Resource libraries easy to link and to maintain (e.g. with a text editor)
- Defined interfaces between the user interfaces, operating system and
application modules
- Variable text field positions and field lengths in the user interface modules
of the application
- Support of different sets of fonts, language-specific characters, keyboard
layouts, date formats etc. by the underlying operating system
Fig. 1: Multilingual Software Design Criteria
Such a multilingual application is thus divided into several inter-communicating
modules and levels (Fig. 2). The actual application program, which can be part of the
database, uses messages and global variables to control the language selection,
display and printing, and the search and conversion functions.
[Figure: Principles of Language Display during Runtime. Components shown: screen
with text field (sample text "Please select ..."); language resources (German,
English, French, Spanish); application; resource data; database; selection lists;
thesauri; transformation modules. The language selector (Lx) in the application
defines which resource is used for display and how the information in the data
field is represented.]
Fig. 2: Language display during runtime of a multilingual program
The variable "Lx" ("Language Resource") determines which texts will be displayed,
and which transformation modules and selection lists will be used to control an entry
or search in a selected language. The information in the database itself is not
changed, but only the screen display and printout.
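The role of "Lx" can be illustrated with a minimal Python sketch. It is not taken from the systems described in this paper: the resource dictionaries, the key names and the German and French wordings are hypothetical, and only the principle (texts referenced by keys, language chosen by one selector) follows the text.

```python
# Hypothetical resource libraries: one dictionary per language, keyed by text key.
RESOURCES = {
    "de": {"PROMPT_SELECT": "Bitte wählen ...", "LABEL_DATE": "Datum"},
    "en": {"PROMPT_SELECT": "Please select ...", "LABEL_DATE": "Date"},
    "fr": {"PROMPT_SELECT": "Veuillez choisir ...", "LABEL_DATE": "Date"},
}


class Application:
    """Holds the language selector Lx and resolves text keys at display time."""

    def __init__(self, lx: str) -> None:
        self.lx = lx  # the language selector ("Lx" in the text)

    def switch_language(self, lx: str) -> None:
        # Switching only changes the selector; no stored data is touched.
        if lx not in RESOURCES:
            raise KeyError(f"no resource library loaded for {lx!r}")
        self.lx = lx

    def text(self, key: str) -> str:
        # Program code refers to keys only, never to literal display texts.
        return RESOURCES[self.lx][key]


app = Application("en")
english = app.text("PROMPT_SELECT")
app.switch_language("de")
german = app.text("PROMPT_SELECT")
```

Because the program refers to keys only, the language can be switched while it is running without restarting it.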
Figure 3 shows the levels of a multilingual application.
[Figure: Layers and Modules of Multi-Lingual Software. Labelled components: user
interface (Windows, Presentation Manager, X-Windows, etc.); user interface
(application); application; language resources, transformation modules, selection
lists, thesauri, language interpreter; database; IRS Information Resources
Management; driver; operating system. The levels are numbered 1 to 4.]
Fig. 3: Multilingual software levels (1-4) and modules
Level 1 essentially handles the presentation of the information, Level 2 converts
the information from one language to another, Level 3 manages the access
information and handles searches, and Level 4 manages the "documents" (datasets,
images, graphics, etc.) on optical storage media. This article will not go into Level 4,
the "IRS" Information Resources Management Program, in greater detail (compare
Kampffmeyer, Ulrich: "Combined WORM and Magneto-optical Mass Storage
Devices and Procedure-Oriented Information Processing Systems", GI Gesellschaft
für Informatik, Arbeitskreis "Datenbanken", Conference at the University of
Oldenburg, Germany, on Feb. 19, 1990). Levels 1-3 and their components will be
explained below.
2.2 User Interface and Application
The user interface depends to a large extent on the underlying operating system.
Many operating systems are not up to the demands of multilingual software, since
they do not allow for reconfiguration during runtime and do not support international
character sets and formats. Operating systems with graphic user interfaces, like
Microsoft Windows and OS/2 Presentation Manager, and operating systems based
on the X Window System (OSF Motif, OpenLook, etc.) are suitable. These systems
allow control of the screen largely independently of the actual operating system itself.
There is a fundamental difference between
a) The standard Windows interface, and
b) The application-specific interface implemented on the basis of this interface. The
application-specific interface uses the tools provided by the standard interface to
represent the functions of the application.
A graphic user interface has numerous advantages: a gentler learning curve,
integrated help functions, and simple operation by mouse, menus, or key
combinations. Another advantage of Windows interfaces is that users can freely
resize windows and other displays.
It would be impracticable to give all displays of a multilingual application their own
user interface, since this would severely limit the number of compatible screen and
printer drivers. The application's user interface should use standard Windows
interface routines wherever possible.
The user interface (Windows as well as application) of a multilingual application
should include the following (see Figs. 4 and 5):
a) Change of key assignments for differing language keyboard layouts during
runtime by the application
b) Change of screen display during runtime by the application
c) Display of language-specific character sets (e.g. German: ä, ö, ü, ß; French: é, è,
ê, ç; Spanish: í, ñ, ¿, ¡; Danish: å, æ; Hungarian: ő, ű, á; Greek: α, β, γ, etc.)
d) Change of formats, as for date, currency, time, etc. during runtime
e) Automatic adaptation of screens and fields to differing text lengths, special
symbols, fonts, etc., under the given monitor resolution
f) Language-specific context-sensitive help based on the cursor position, current
program status and the feasible or just completed action.
g) Modules loadable during runtime without leaving the program
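Requirement e), the adaptation of fields to differing text lengths, can be sketched in Python as follows. This is an illustrative simplification: the labels are hypothetical, and widths are counted in character cells, whereas a real graphic interface would use the font metrics of the windowing system.

```python
# Hypothetical button labels in three languages, referenced by key.
LABELS = {
    "en": {"SAVE": "Save", "SEARCH": "Search"},
    "de": {"SAVE": "Speichern", "SEARCH": "Suchen"},
    "fr": {"SAVE": "Enregistrer", "SEARCH": "Rechercher"},
}


def layout_row(lx, padding=2, gap=1):
    """Place the labels of language lx left to right.

    Returns a mapping key -> (x position, width), both in character cells.
    Positions are recomputed from the text lengths of the selected language.
    """
    positions = {}
    x = 0
    for key, text in LABELS[lx].items():
        width = len(text) + padding
        positions[key] = (x, width)
        x += width + gap
    return positions


english_layout = layout_row("en")
french_layout = layout_row("fr")
```

The same two fields occupy 15 cells in English and 26 in French, which is why fixed field positions are unsuitable for multilingual screens.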
Operating System and User Interface - Requirements for European Software
The operating system and window interface must support several features
to enable switching the language during runtime:
- Change of keyboard setting and screen display during runtime via external program
- Enhanced keyboard setting with special characters of European languages
(ç, ê, æ, å, ø, ä, etc.)
- Support and change during runtime of date and time formats
- Graphic interface with virtual window architecture to allow different sizes
of screens and fields while changing the language
- Context-sensitive help in relation to the actual position of the cursor
Fig. 4: Operating system and user interface
User Interface (application) - Requirements
The user interface has to support several functions to enable change of
language during runtime:
- Object-oriented software
- Change of screens, settings and styles during runtime
- Dynamic positioning of fields
- Automatic adaptation to different field lengths
- Controllable by the application program
- Loadable modules during runtime for messages, windows and help texts
- Dynamic data and message interchange with operating system and user interface,
application program and database
Fig. 5: User interface
The most important feature of a multilingual application is convertibility during
runtime, without having to load and start another program and without changing the
screen and screen information content (Fig. 4).
The text components are kept in separate files, called "resource libraries" (Fig. 2).
The resource libraries can be loaded on the fly via the language selection variable Lx.
For language resources to be usable, all texts in a program that are going to be
displayed or printed must be referenced by an unambiguous key variable in the
appropriate library.
Resource libraries must exist for:
a) All static texts in dialogue boxes and masks. These are texts which are
associated with a given dialogue box and do not change.
b) Dynamic texts in dialogue boxes and masks. These are texts which change,
appear, or disappear according to status (messages). This includes the graying
out of inoperative or unavailable functions on menus and buttons.
c) Help texts which appear automatically or interactively.
d) Error messages, system messages and other operation-related messages.
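Since every text is referenced by key, a maintenance tool can check that each language library covers all four categories above. The following Python sketch shows one possible check; the key names and the library layout are assumptions made for illustration.

```python
# Hypothetical resource libraries covering static texts, dynamic messages,
# help texts and error messages. The French library is deliberately incomplete.
LIBRARIES = {
    "en": {
        "DLG_TITLE": "Search",
        "MSG_SAVED": "Document saved",
        "HELP_SEARCH": "Enter a search term.",
        "ERR_DISK": "Disk not ready",
    },
    "de": {
        "DLG_TITLE": "Suche",
        "MSG_SAVED": "Dokument gespeichert",
        "HELP_SEARCH": "Suchbegriff eingeben.",
        "ERR_DISK": "Laufwerk nicht bereit",
    },
    "fr": {
        "DLG_TITLE": "Recherche",
        "MSG_SAVED": "Document enregistré",
        "ERR_DISK": "Disque non prêt",
    },
}


def missing_keys(libraries):
    """Map each incomplete language to the sorted list of keys it lacks."""
    all_keys = set()
    for library in libraries.values():
        all_keys.update(library)
    report = {}
    for language, library in libraries.items():
        missing = all_keys - set(library)
        if missing:
            report[language] = sorted(missing)
    return report


report = missing_keys(LIBRARIES)
```

Running such a check before a library is released avoids a key that resolves in one language and fails in another at runtime.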
Language Resources - Requirements
Language resources are used for displaying texts related to the unique keys in
the application:
- Loadable modules for each language
- Every entry in the language resource is referenced by a unique key
which may be used by different applications and the database itself
- Language resources are needed for every text on an entry or search screen form,
every message, every help text, and icons adapted for each country
- Editor or tools for translation support
Fig. 6: Language resources
Many applications use icons and buttons to simplify option selection. If they bear text
or abbreviations (such as "B" for bold), these must be converted when the language
is changed (thus, in German "F" for "Fett" = bold). For this reason these icons and
buttons should likewise be kept in dedicated resource libraries instead of being
managed directly in the program. The same applies to icons with graphics, where
the graphics do not bring across the same meanings in a different language area or
country.
Object-oriented programming languages and databases often support the use of
loadable resources, making them preferable to traditional programming tools.
The right choice of tools is important for the creation of applications based on a
programming language or database. The application is the superposed, integrative
component of the system as a whole (compare Figs. 2 and 7). The application
contains not only the usual data-processing algorithms and input/output modules,
but also the control and selection of language resources (transformation modules,
selection lists, thesauri, help texts, messages, screen layout and display, etc.).
Application - Characteristics
- Object-oriented, message-driven program
- Direct control of database and user interface
- Numeric keys for every text entry related to the screen display and database fields
- Transformation modules, selection lists, thesauri, language interpreters and
language resources as loadable modules
- Database as loadable module or client-server communication via SQL
Fig. 7: Components of the application
Object-oriented programs with a "message" concept, such as Microsoft Windows,
allow continuous control of the resources used and the condition of the screen.
Direct communication should be set up for control of the modules on level 2 (Fig. 3).
SQL can be used as a standardized interface for communication with the database
in which the actual information is kept and managed. All modules on levels 2, 3, and
4 (Fig. 3) should be directly accessible or loadable during runtime.
2.3 Transformation Modules
The numerical information in the database is stored in a format that can be
converted as needed for the currently loaded language resource. This conversion is
controlled by the variable "Lx" (Fig. 2). Transformation modules are considerably
easier to implement than text translators, since they work by exact rules and with
numeric values only (Fig. 8).
Transformation Modules - Types
Transformation modules are used for the display transformation
of numeric values of the database:
- Transformation of date formats (supported by operating system)
- Transformation of time formats (supported by operating system)
- Transformation of addresses (position of postal codes, etc.)
- Transformation of units of measure (litre to gallon, km to mile, etc.)
- Transformation of international standardized nomenclature (country and city
names, etc.)
- Transformation of user-defined values (see selection lists, etc.)
Fig. 8: Transformation modules
The most important standard transformation modules are:
a) Date formats
This module toggles the display format of dates between American (month-day-
year) and European (day-month-year). This function is often supported by the
operating system directly, and allows use of either the months' full names or their
abbreviations. The transformation module should be designed to cope with the
conversion of pre-2000 dates into the next century. This is important for all data
which must be retained for several years. The date transformer module must also
ensure the proper sorting during display.
b) Time formats
The same applies to time-display formats. For firms active on an international
scale, data is best stored in "Coordinated Universal Time" format (UTC).
Date and time transformation modules can be set up to check whether the
system's internal time setting is correct (the current date and time must always be
later than that of the last document to be saved; calibration with standard working
hours and days, etc., in order to be able to determine system down time if
necessary).
c) Address conversion
Address-format conversion affects printouts more than it does on-screen
displays. Addresses in Europe are not standardized, and use a variety of
sequences of street, house number, and postal code. This transformer module
recognizes the country of the addressee and selects the appropriate address
format for printouts.
d) Units of measure
This is an important requirement for international trade and manufacturing
companies. For example, in the oil industry large quantities of different types of
oil and petroleum products are transported and handled daily. Measurement
values and with them customs and tax rates constantly fluctuate, depending on
the type of product and its specific weight and even on the ambient temperature.
In cross-border trade the units of measure as well as of currency must be
automatically converted.
The most important categories are units of currency, distance, weight, and
volume.
e) EDI data
Standardised electronic data interchange (EDI), such as EDIFACT, allows entire
business transactions to be handled electronically, without paper originals. The
data is archived digitally. For display and printouts, EDI codes are converted into
text. This conversion can be made language-specific through a language control
variable. With EDI data it is necessary to know what version of a given EDI
application the data will be converted with.
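The date and time modules described under a) and b) can be sketched in Python as follows. The display patterns and the language codes are assumptions made for illustration. The sketch stores timestamps in UTC with four-digit years, sorts on the stored value rather than on the displayed string, and includes the simple plausibility check on the system clock mentioned above.

```python
from datetime import datetime, timezone

# Hypothetical display patterns per language selector.
# The stored value itself is never changed by the display transformation.
DATE_PATTERNS = {
    "en_US": "%m/%d/%Y %H:%M",  # month-day-year
    "de": "%d.%m.%Y %H:%M",     # day-month-year
    "fr": "%d/%m/%Y %H:%M",     # day-month-year
}


def display_timestamp(stored_utc: datetime, lx: str) -> str:
    """Render a stored UTC timestamp in the format of language lx."""
    return stored_utc.strftime(DATE_PATTERNS[lx])


def clock_is_plausible(now_utc: datetime, last_saved_utc: datetime) -> bool:
    """The current time must be later than that of the last saved document."""
    return now_utc > last_saved_utc


stored = datetime(1999, 12, 31, 23, 30, tzinfo=timezone.utc)
later = datetime(2000, 1, 1, 0, 15, tzinfo=timezone.utc)

american = display_timestamp(stored, "en_US")
german = display_timestamp(stored, "de")

# Sorting uses the stored values, so the order is correct across the year 2000
# regardless of the display format.
ordered = sorted([later, stored])
```

Sorting the formatted strings instead would place "01.01.2000" before "31.12.1999", which is the kind of error the text warns about.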
Further transformation modules can be added to cover other requirements for
specific industries and applications, for example converting product codes into text.
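One possible way to keep transformation modules extensible is a registry in which each module is a function of the stored numeric value and the language selector. The Python sketch below is an assumption about how this could be organised, not a description of the systems presented here; the module names and language codes are hypothetical, while the conversion factors are the standard ones for the statute mile and the US gallon.

```python
from typing import Callable, Dict

# A transformation module takes the stored numeric value and the language
# selector and returns the text to display.
Transformer = Callable[[float, str], str]
REGISTRY: Dict[str, Transformer] = {}


def register(name: str):
    """Decorator that adds a transformation module to the registry."""
    def decorator(func: Transformer) -> Transformer:
        REGISTRY[name] = func
        return func
    return decorator


@register("distance_km")
def distance(value_km: float, lx: str) -> str:
    if lx == "en_US":
        return f"{value_km * 0.621371:.1f} mi"
    return f"{value_km:.1f} km"


@register("volume_litre")
def volume(value_litre: float, lx: str) -> str:
    if lx == "en_US":
        return f"{value_litre / 3.785411784:.2f} gal"
    return f"{value_litre:.2f} l"


def transform(name: str, value: float, lx: str) -> str:
    """Look up the module by name and apply it; the stored value is unchanged."""
    return REGISTRY[name](value, lx)


miles = transform("distance_km", 100.0, "en_US")
gallons = transform("volume_litre", 10.0, "en_US")
kilometres = transform("distance_km", 100.0, "de")
```

A further module, for instance one that converts product codes into text, would be added by registering one more function, without changes to the calling code.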
2.4 Selection Lists
Graphic interfaces like Microsoft Windows support single and multiple selection lists
(Fig. 9). With single selection lists only one item on a given list can be marked and
processed. With multiple selection lists, one or more items can be selected.
Selection Lists - Characteristics
Selection lists are an easy way to translate information
and to save storage capacity:
- The list displays a text on the screen related to a database value
- Every entry in a selection list refers to a value which is related to a database field
- Every entry in the different language versions of a list refers to the same value
- The database has to store only the numeric value of the entry
- Selection lists help to standardize nomenclature in multinational and
multilingual organizations
- Selection lists can be used as single and multiple-choice lists
Fig. 9: Selection Lists
Selection lists offer several advantages over regular text-entry fields in database
applications:
a) No typing mistakes
b) Selection lists keep the database uniform and ensure that entries can be easily
found again. Since the user must decide from among a set of given expressions,
entries are standardized.
c) The database stores only a reference number which refers to a text resource.
This keeps space requirements low, and different text resources can be
accessed depending on the language variable. Retrieval is faster, since the
system must search only through predefined numbers instead of text sequences.
d) Multiple selection lists facilitate the multiple allocation of a document and allow
the user to select a number of related items if he/she is unsure about the
allocation to a single one.
Point c) above is the most important factor for multilingual applications. The use of
reference numbers allows linkage to multiple lists in different languages. The
reference numbers can also be used to limit access, so that only cleared items
are shown in a search. Selection lists also facilitate data entry through the use of
presettings for recurring entries.
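The principle of storing only reference numbers can be sketched with Python and an in-memory SQLite database. The table layout, the category names and the document numbers are hypothetical. The point of the sketch is that a document classified through one language version of the list is found through any other version, because both refer to the same numeric value.

```python
import sqlite3

# Hypothetical selection list: the same numeric value in every language version.
CATEGORY_LIST = {
    "en": {1: "Press release", 2: "Annual report", 3: "Price list"},
    "de": {1: "Pressemitteilung", 2: "Geschäftsbericht", 3: "Preisliste"},
}

# The database stores only the reference number, never the text.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, category INTEGER)")
db.executemany(
    "INSERT INTO documents (id, category) VALUES (?, ?)",
    [(10, 1), (11, 3), (12, 1)],
)


def code_for(label: str, lx: str) -> int:
    """Translate the entry selected on screen into its reference number."""
    for code, text in CATEGORY_LIST[lx].items():
        if text == label:
            return code
    raise KeyError(label)


def search(label: str, lx: str) -> list:
    """Find all documents whose category matches the selected list entry."""
    rows = db.execute(
        "SELECT id FROM documents WHERE category = ? ORDER BY id",
        (code_for(label, lx),),
    )
    return [row[0] for row in rows]


def display_category(doc_id: int, lx: str) -> str:
    """Show the category of a document in the language given by lx."""
    (code,) = db.execute(
        "SELECT category FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    return CATEGORY_LIST[lx][code]


german_hits = search("Pressemitteilung", "de")
english_hits = search("Press release", "en")
```

The query compares integers only, which is also why retrieval is faster than a search through text sequences.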
Selection lists can be created with standard text editors. However, this should be
done only by authorised persons, since changes to, and especially deletions of,
entries in a selection list can compromise the consistency of
the database. Strict update and maintenance rules are a must for distributed
systems and resources.
Selection lists with restricted vocabulary are the ideal medium for standardising
terminology within a company and for creating multilingual software systems.
Multilingual systems should avoid free text entry wherever possible and use
selection lists whenever feasible.
2.5 Thesauri
This term has widely differing meanings. In its original meaning it refers to a defined
specialist terminology, broken down hierarchically from the general down to the
precise. The terms differ clearly from one another and are distributed over several
hierarchical levels. A generic term at one level branches into a number of more
precise terms on the level below it. All terms at a given level should be at a similar
level of detail.
However, in many word-processing programs the "thesaurus" is simply a utility
showing possible synonyms. This familiar kind of thesaurus is completely unrelated
to the structured terminology system described above, as defined for example by the
International Organization for Standardization (ISO) for monolingual and
multilingual thesauri.
Thesauri
Thesauri offer a hierarchically structured and crosslinked nomenclature:
- One field on the screen may be represented by a structured hierarchical thesaurus
- Similar to a selection list, the thesaurus displays a text related to a database value
- The thesaurus offers navigation and interpretation tools
- The thesaurus is a database in itself which relates numeric values to texts
and provides additional structure by hierarchic order and crosslinks
- The structure of thesauri is standardized by ISO
- The same thesaurus may be used by different applications
Fig. 10: Thesauri
Seen from the outside, the thesauri we are discussing here for multilingual systems
act similarly to selection lists (Fig. 10, compare also Fig. 9). First a list of generic
terms is displayed (ISO Top Term; TT). Once a top term has been selected, the
more precise terms subordinated to it are shown on a second list (ISO Narrower
Term; NT). When one of these is selected it forms the new generic term (ISO
Broader Term; BT) for the next level of narrower terms (Fig. 11). This strict hierarchy
is fully applicable to only a few subject areas. Therefore, the ISO standard provides
for crosslinks. These link terms from different levels and branches independently
of their position in the hierarchy. This is easier to follow in a running program than
it is to describe in print.
An electronic thesaurus is referenced by numbers in the program just as a
selection list is (see section 2.4). However, unlike a selection-list entry, a thesaurus
entry includes not only a "unique identifier" number in the database, but also flags
which specify its display position (level and branch in the hierarchy) and the type
(and if necessary direction) of links. The links allow a term to be associated with
more than one top or broader term in other branches, and a broader term to be
linked to several narrower terms in other branches, regardless of position in the
hierarchy. The use of different link types (unidirectional, bidirectional,
broad-to-narrow, narrow-to-broad, additional reference, synonym, etc.) makes it
easier to navigate in such a system. In principle the electronic thesaurus is a
complete database application in its own right, which stands between the user
interface and the database proper.
The database proper stores only the unique identifier. If this is referenced with a
"narrower term", using its links and hierarchical position all associated broader terms
up to the top term can be found.
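How the broader terms are recovered from a stored identifier can be sketched in Python as follows. The entries are hypothetical: the pairing of identifiers 1 to 7 with the positions 1000 to 1220 is an assumption loosely modelled on the numbering style of Fig. 11, and crosslinks are left out here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    uid: int                # unique identifier stored in the database proper
    position: str           # display position in the hierarchy, e.g. "1110"
    broader: Optional[int]  # uid of the broader term, None for a top term


# Hypothetical thesaurus with one top term and two levels below it.
THESAURUS: Dict[int, Entry] = {
    entry.uid: entry
    for entry in [
        Entry(1, "1000", None),
        Entry(2, "1100", 1),
        Entry(3, "1200", 1),
        Entry(4, "1110", 2),
        Entry(5, "1120", 2),
        Entry(6, "1210", 3),
        Entry(7, "1220", 3),
    ]
}


def broader_chain(uid: int) -> List[int]:
    """Return the uids of all broader terms, from the nearest up to the top term."""
    chain = []
    current = THESAURUS[uid].broader
    while current is not None:
        chain.append(current)
        current = THESAURUS[current].broader
    return chain


chain_for_narrow_term = broader_chain(4)
chain_for_top_term = broader_chain(1)
```

The document record holds only the value 4; the chain 2, 1 is derived from the thesaurus whenever it is needed.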
[Figure: Thesauri - Hierarchy and Crosslinks. Two panels: "The Hierarchical View of
the Thesaurus (Top Term, Broader Term, Narrower Term)" and "The Network Structure
of the Thesaurus (Crosslinks independent of the hierarchical position)". Each node
carries a unique identifier and a position in the hierarchical view (1000, 1100,
1200, 1110, 1120, 1210, 1220).]
Fig. 11: Hierarchy and virtual linkages (crosslinks)
An electronic thesaurus is represented internally as a network (relational system),
but to the outside as a hierarchy. Thus, the composition of a list of terms depends
not only on the broader term, but also on the links and the route taken to get to the
broader term. Unlike with a selection list, the lists displayed by an electronic
thesaurus can differ from situation to situation.
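A much simplified Python sketch of this behaviour follows. The link table and the identifiers are hypothetical, only unidirectional and bidirectional links are modelled, and the dependence on the route taken is not covered. The list shown for a term is built from its hierarchical narrower terms plus any terms reachable through crosslinks.

```python
from typing import Dict, List, NamedTuple


class Link(NamedTuple):
    source: int
    target: int
    bidirectional: bool


# Hypothetical hierarchy (broader term -> narrower terms) and crosslinks.
NARROWER: Dict[int, List[int]] = {1: [2, 3], 2: [4, 5], 3: [6, 7]}
LINKS = [
    Link(3, 5, bidirectional=False),  # term 5 is also offered under term 3
    Link(4, 6, bidirectional=True),   # terms 4 and 6 refer to each other
]


def linked_terms(uid: int) -> List[int]:
    """Terms reachable from uid through crosslinks, respecting direction."""
    found = []
    for link in LINKS:
        if link.source == uid:
            found.append(link.target)
        elif link.bidirectional and link.target == uid:
            found.append(link.source)
    return found


def displayed_list(uid: int) -> List[int]:
    """Narrower terms from the hierarchy, followed by crosslinked terms."""
    terms = list(NARROWER.get(uid, []))
    for term in linked_terms(uid):
        if term not in terms:
            terms.append(term)
    return terms


under_term_3 = displayed_list(3)
under_term_6 = displayed_list(6)
under_term_5 = displayed_list(5)
```

Term 5 appears under term 3 although its hierarchical position is below term 2, while the unidirectional link offers no way back from term 5 to term 3.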
In addition to assisting navigation by displaying the selection lists specific to a
previously selected broader term, a database-supported thesaurus can also be used
in "specialist" or "beginner" mode. For data entry, specialist mode is best: it
allows direct entry of a narrower term or an abbreviation, with the
system determining the associated broader terms without the user having to go
through the hierarchy. Users who are inexperienced with hierarchical selection lists
or with the subject content of the thesaurus are better off searching in
beginner mode, in which the system analyses the user's text input, looks for a match
in the thesaurus, and if in doubt shows a synonym list and a help text suggesting a
repeat attempt or a more closely defined query. Such a "global search" can also
draw on further fields or other resources of the thesaurus.
Fig. 12 shows how a number of "slices" are assigned to the reference keys of the
thesaurus database. Each of the language slices contains all information on the
hierarchical and network structure of the terms, since this will differ from language to
language (narrower or broader terms, different semantic fields). However, regardless
of the differences among the languages the same information must be clearly
accessible in all. Therefore, the unique identifier is assigned not only the term itself
(main keyword), but also acronyms (e.g. "NASA"), homonyms (words that sound the
same but have different meanings), synonyms, plural forms, explanatory notes, etc.
This information is also accessed during a global search.
The "language slices" need not necessarily contain foreign languages; they can also
contain different aspects of a single language. This is particularly useful for
specialist languages. Thus, one slice can contain the regular colloquial language,
with only two or three levels and accessible to everyone, while another slice can
contain the terminology for a specialist field broken down into more levels and
accessible only to those working in that field. This allows control of the extent, depth,
and accessibility of information.
[Figure: "slice" model of a multilingual thesaurus. Each unique ID (A1, A2, A3 ... An) points to one entry in each of the German, French, and English language slices. Every entry holds the IDs of predecessors, the IDs of successors, the position in the hierarchy, the main keyword, synonyms, homonyms etc., and a help text.]
Fig. 12: Slice structure of a multilingual thesaurus
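A minimal sketch of the slice model in Python may help to make the structure concrete. The class `SliceEntry`, the sample terms, and the helper functions are assumptions made for illustration; only the list of fields per entry follows Fig. 12.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SliceEntry:
    """One term as it appears in a single language slice."""
    main_keyword: str
    predecessors: list[int] = field(default_factory=list)  # broader terms
    successors: list[int] = field(default_factory=list)    # narrower terms
    position: str = ""                                      # place in hierarchy
    variants: list[str] = field(default_factory=list)      # synonyms, acronyms
    help_text: str = ""


# slice name -> unique identifier -> entry. The identifiers are shared,
# while structure and wording may differ from slice to slice.
THESAURUS: dict[str, dict[int, SliceEntry]] = {
    "de": {
        101: SliceEntry("Umwelt", successors=[102], position="1"),
        102: SliceEntry("Gewässerschutz", predecessors=[101], position="11",
                        variants=["Wasserschutz"]),
    },
    "en": {
        101: SliceEntry("Environment", successors=[102], position="1"),
        102: SliceEntry("Water protection", predecessors=[101], position="11",
                        variants=["water conservation"]),
    },
}


def find_id(slice_name: str, text: str) -> int | None:
    """Look up a term or variant in one slice and return its unique ID."""
    wanted = text.lower()
    for ident, entry in THESAURUS[slice_name].items():
        names = [entry.main_keyword, *entry.variants]
        if wanted in (n.lower() for n in names):
            return ident
    return None


def translate(text: str, source: str, target: str) -> str | None:
    """Map a term from one slice to another via the shared unique ID."""
    ident = find_id(source, text)
    if ident is None or ident not in THESAURUS[target]:
        return None
    return THESAURUS[target][ident].main_keyword
```

Because the lookup runs through the unique ID, a synonym entered in one slice (here "Wasserschutz") resolves to the main keyword of another slice ("Water protection") without any word-for-word translation.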
In addition to the modular slice structure, an electronic thesaurus database offers
many advantages:
a) A standardized, controlled vocabulary ensures unambiguous and complete
retrieval of all correctly entered information.
b) Entry errors are prevented.
c) Selection lists and help functions assist the user in finding his or her way through
extensive, many-layered specialist vocabularies.
d) Functions like "global search" enable searches to include synonyms, homonyms,
acronyms and other references as well as the help text itself.
e) The organization and structure of thesauri are internationally standardized.
f) A thesaurus database acts as a pre-processor, saving time in searches in the
database proper, since only short, unambiguous numerical references need be
searched and evaluated. The thesaurus then converts the unique identifiers for
display.
g) Thesaurus databases can be run on a PC LAN, thus reducing the workload on
the central database and information resources management (IRS; see below
and Fig. 3).
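Advantage f) above, the thesaurus as a pre-processor, can be sketched in a few lines of Python. The document records, labels, and function names are hypothetical. The sketch assumes that documents store only numeric subject identifiers and that labels are resolved at display time.

```python
from __future__ import annotations

# Hypothetical data: every document carries only numeric subject IDs,
# plus the language in which it was entered.
DOCUMENTS = [
    {"doc": "A-17", "language": "de", "subjects": {1110}},
    {"doc": "B-03", "language": "fr", "subjects": {1110, 1200}},
    {"doc": "C-42", "language": "it", "subjects": {1200}},
]

# Thesaurus labels per language, keyed by unique identifier.
LABELS = {
    1110: {"de": "Ausfuhr", "fr": "Exportation", "en": "Exports"},
    1200: {"de": "Steuern", "fr": "Impôts", "en": "Taxes"},
}


def term_to_id(term: str) -> int:
    """Pre-processing step: turn the user's term into its numeric ID."""
    for ident, by_language in LABELS.items():
        if term.lower() in (label.lower() for label in by_language.values()):
            return ident
    raise KeyError(term)


def search(term: str, display_language: str) -> list[tuple[str, str, list[str]]]:
    """Search by numeric ID only, then convert the IDs back for display."""
    wanted = term_to_id(term)
    results = []
    for record in DOCUMENTS:
        if wanted in record["subjects"]:
            shown = sorted(LABELS[i][display_language] for i in record["subjects"])
            results.append((record["doc"], record["language"], shown))
    return results
```

A query for "Exports" finds the German and French documents alike, since the database proper compares only the identifier 1110; the labels are attached afterwards in whichever language the user has chosen.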
If the system includes optical-systems management software in addition to the
thesaurus database and the database proper, it has a three-level database
hierarchy (compare Figs. 3 and 26):
a) Database for one or more thesauri (local or central)
b) Database for managing unique identifiers to selection lists and thesauri and for
managing database entries (numerical, alphanumeric, date, time, Boolean
variables, etc.)
c) Information retrieval and access system (IRAS). As a rule this is a non-standard
database for managing WORM (write-once) media, erasable, rewritable, and M/O
optical media, or read-only media (CD-ROM).
A standard database (preferably relational) can be used for the thesaurus database
as well as for the database proper. Full-text databases are not suitable for this type
of application (Fig. 13).
Database
Characteristics
- Support of an optical disk information retrieval system for mass data management
- A standard relational database may be used to manage data
  (except for language interpretation)
- Standard fulltext databases are not usable
Fig. 13: Database characteristics
2.6 Fulltext Translation
The electronic interpretation and translation of running text requires very different
strategies from those described up until now. Transformation modules, selection lists
and thesauri can be combined in a system as desired, since they all work by the
same rules: Numerical identifiers are transformed into predefined expressions in
defined ways.
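The common rule shared by these modules can be shown with a very small Python sketch. The expression table and the `transform` function are illustrative assumptions, not part of any system described here.

```python
# Hypothetical expression tables: numeric identifier -> text per language.
EXPRESSIONS = {
    1: {"en": "Document type", "de": "Dokumentart", "fr": "Type de document"},
    2: {"en": "Date of entry", "de": "Erfassungsdatum", "fr": "Date de saisie"},
}


def transform(ident: int, language: str, fallback: str = "en") -> str:
    """Turn a stored numeric identifier into the predefined expression
    for the requested language, falling back to a default language."""
    by_language = EXPRESSIONS[ident]
    return by_language.get(language, by_language[fallback])
```

Since the stored value is always the identifier, switching the display language amounts to choosing a different column of the table; nothing in the database itself changes.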
A system capable of analysing running text is difficult to combine with these
modules. It is an independent and complex software system made up of many
component parts (Figs. 14 and 15).
Language Interpreter
Characteristics
The language interpreter contains different modules which allow
translation and interpretation of fulltext databases.
- Dictionaries provide information for the direct translation of nouns
  (singular, plural, conjunctions, etc.)
- Statistical modules support the interpretation of the noun inside a text
- Linguistic modules support the interpretation of the grammatical context
- Comparison modules combine the different strategies of interpretation
- Presentation modules display the answer to a query in the chosen language
  as translated fulltext
- Inverted file and cache modules optimize access
Fig. 14: Components of a language translation system
[Figure: Language Interpreter structure. The user interface (query entry and display) sits above the language interpreter, which comprises presentation, linguistic, statistic, and dictionary modules joined by a comparison module. Below the interpreter are the inverted file and the database.]
Fig. 15: Structure of a language translation system
a) Dictionaries contain the individual words in their different forms (plural, singular,
declined, conjugated, irregular verb forms, etc.). As a rule the dictionary will
constitute a database application of its own. However, it is completely different in
structure, makeup and content from the thesaurus discussed above.
b) Statistics modules analyse the occurrence and composition of words and
combinations of words.
c) Linguistic and grammatical-analysis modules are the most difficult part. They
must contain all the rules and comparative examples required to analyse syntax.
Pattern recognition and fuzzy logic techniques are often used for this purpose.
d) The results of a), b) and c) above are combined, evaluated and interpreted in a
comparison module. The comparison module is designed so that intermediate
results of one module can be returned to another module for evaluation. This
gives rise to an iterative process with a relatively high rate of recognition in texts
on specific subjects for which there are electronic dictionaries containing the
subject terminology.
e) Due to their architecture, traditional databases are not very effective at time-
consuming text analysis. To speed things up, special cache and inverted file
modules are often used as intermediaries.
f) Presentation modules handle the correct on-screen presentation of the translated
text. They work with information from the dictionary module, the evaluated text
from the database, and the inverted file system.
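How a comparison module might combine the results of other modules can be hinted at with a deliberately small toy example in Python. Everything in it is an assumption made for illustration: the word lists, the invented co-occurrence counts, and the single feedback step. Real linguistic analysis is far more involved.

```python
from __future__ import annotations

# Dictionary module: candidate translations for a source word
# (German to English).
DICTIONARY = {
    "bank": ["bank", "bench"],
    "geld": ["money"],
    "park": ["park"],
}

# Statistics module: how often a candidate was seen near a context word.
# The counts are invented for illustration.
COOCCURRENCE = {
    ("bank", "money"): 40,
    ("bench", "money"): 1,
    ("bank", "park"): 2,
    ("bench", "park"): 25,
}


def dictionary_module(word: str) -> list[str]:
    """Return all known translations, or the word itself if unknown."""
    return DICTIONARY.get(word.lower(), [word])


def statistics_module(candidate: str, context: list[str]) -> int:
    """Score a candidate by its co-occurrence with the context words."""
    return sum(COOCCURRENCE.get((candidate, c), 0) for c in context)


def comparison_module(words: list[str]) -> list[str]:
    """Combine both modules: translate the unambiguous words first, then
    feed that intermediate result back as context for the ambiguous ones."""
    candidates = [dictionary_module(w) for w in words]
    context = [c[0] for c in candidates if len(c) == 1]
    return [
        max(c, key=lambda cand: statistics_module(cand, context))
        for c in candidates
    ]
```

In this toy setting the German word "Bank" is rendered as "bank" next to "Geld" and as "bench" next to "Park", which mirrors the idea of passing intermediate results from one module to another.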
The running text interpretation system we have described can be used to evaluate
queries in regular text. Fig. 15 shows the processing path for a query. The system
goes through the modules from bottom to top in the same way to convert a text out of
the database. The system shown here is just one possible configuration. Since this
technology is very new, many other approaches are being investigated. This
particular approach has the advantage that different modules with differing
evaluation strategies can be consulted simultaneously. Furthermore, each module
can be dedicated to certain languages or vocabularies, and accessed automatically
by the comparison module as needed. The interpretation and translation of a text is
very time-consuming, and usually possible only on very fast dialogue computers.
Complex systems such as the one described should not be confused with simple
translation aids.
Traditional full-text databases are seldom suitable for such systems. Standard
database software uses a strategy of leaving out filler words, adjectives, adverbs,
etc. in order to save memory space and increase database speed. However, a
language interpretation system needs all of the information contained in the text,
since otherwise coherent, context-adequate translation is not possible.
"Language Interpreter" database systems have enjoyed initial successes at the
United Nations and the European Commission.
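The loss of information caused by leaving out filler words can be demonstrated with a short Python sketch. The stop-word list and the function names are hypothetical; the sketch contrasts a reduced index with a positional inverted file that keeps every word.

```python
from __future__ import annotations

# A small, hypothetical stop-word list of the kind used to shrink an index.
STOP_WORDS = {"the", "is", "not", "a", "of", "in"}


def reduced_index(text: str) -> dict[str, int]:
    """Index in the style of a space-saving full-text database:
    filler words are dropped and only occurrence counts are kept."""
    index: dict[str, int] = {}
    for word in text.lower().split():
        if word not in STOP_WORDS:
            index[word] = index.get(word, 0) + 1
    return index


def positional_index(text: str) -> dict[str, list[int]]:
    """Inverted file that keeps every word together with its positions,
    so that the running text can be reconstructed for interpretation."""
    index: dict[str, list[int]] = {}
    for position, word in enumerate(text.lower().split()):
        index.setdefault(word, []).append(position)
    return index


def reconstruct(index: dict[str, list[int]]) -> str:
    """Rebuild the (lower-cased) text from a positional index."""
    placed = {pos: word for word, positions in index.items() for pos in positions}
    return " ".join(placed[i] for i in range(len(placed)))
```

With the reduced index, "the water is not polluted" and "the water is polluted" become indistinguishable, whereas the positional index preserves the negation and the word order that a translation needs.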
The choice of a system for multilingual database applications is still simple at this
point:
a) For document-oriented (facsimile) systems, applications with controlled
vocabularies, and systems intended to bring about a standardization of use, the
transformation, selection list and thesaurus approach is the right choice.
b) For full-text applications which will not go into full use within the next three to four
years, the approach described in this section should be attempted or at least
examined.
At present there is no commercial software immediately available for either
application, nor are off-the-shelf solutions likely to become available in the future,
since the nature of the application and the vocabulary will be subject to constant
change.
However, in my opinion an approach as shown in Fig. 3 is ideal. It combines the
different transformation and interpretation components in one level where they work
in parallel. They link the user interface with the database proper. This integrative
approach combines the advantages of all of the techniques named, which can then
be used individually or in combination as needed.
3. Sample Applications
We will now look at multilingual information and retrieval systems from the user's
point of view, using two examples.
Application Examples
HYPARCHIV   Standard optical filing software
            for Microsoft Windows in 9 languages
wfBase      Distributed press and commercial information system
            in 4 languages based on ISDN nodes
            (wf, Switzerland)
HEMIS       Meta-database and information system for
            environmental data, information, programmes, methods, etc.
            for CD-ROM distribution
            (UNEP/HEM, worldwide)
Fig. 16: Application examples
a) wfBase Press and economics information in a distributed document-
imaging system
b) HEMIS Environmental information on CD-ROM
3.1 wfBase
wfBase was developed specially for the Swiss Institute for Commercial Development
(German "Wirtschaftsförderung", hence "wf"). It has been in operational use since
1992.
The Swiss Institute for Commercial Development is located in Zürich, with offices in
Geneva, Bern and Lugano. Prior to the introduction of wfBase, dossiers on political
events, economic data, and the like were kept independently at all four locations.
The goal of wfBase is to enable access by all Institute users to all press articles,
periodicals, and Institute documents, independent of the language of data entry
(Figs. 17 and 18).
wfBase
wf Schweizer Wirtschaftsförderung
Swiss Institute for Commercial Development
Zürich - Geneva - Bern - Lugano
The wf owns one of the largest archives on commercial and political topics in
Switzerland. It provides information to politicians, journalists and its commercial
members representing all major companies of Switzerland.
- Optical filing system for press and commercial documents
  (scanned and created via word processor, spreadsheet, etc.)
- Distributed system linked via SwissNet 2 (ISDN)
- Access for wf employees and third-party partners via multilingual graphical
  user interface (ISDN and telephone modem)
- Database with four-language thesaurus
- Access to information independent of the language in which it was entered
- Several million documents stored on M/O jukeboxes (2 times 50 gigabytes)
- Integrated office communication with word processing, spreadsheet, fax,
  library management, electronic mail, accounting, address database, etc.
Fig. 17: wfBase - Features
[Figure: wfBase storage and communications layout. Zürich hosts the DB server, the archive server, and the communications server on a Novell NetWare network, with one jukebox for internal use and one for external use. Each of the four sites (Zürich, Lugano, Geneva, Bern) has wf users and a hard-disk cache for images, files, and descriptors (Zürich: read / write / create; the other sites: read / create). The sites are linked via ISDN (SwissNet 2). External users connect by telephone modem or ISDN and have their own hard-disk cache; addresses, library, and dossiers are also shown.]
Fig. 18: wfBase - System configuration with internal and external users and information
management in two jukeboxes (Zürich)
wfBase also integrates other applications besides document management under its
graphical user interface, such as word processing and spreadsheet applications,
address and library management, billing for outside users, electronic faxing and
mailboxes, etc. The wfBase system makes use of some HYPARCHIV modules, but is
otherwise an independent application with client-server architecture and a relational
database on an OS/2 server. The MS Windows workplaces are linked together in a
Novell network. Outside users can access wfBase by modem, query documents
("subsets"), and display and print them locally or have wfBase fax the documents to
them.
The four wfBase locations are linked by SwissNet2 (ISDN). This powerful network
allows compressed scanned facsimile transmission. Two jukeboxes store scanned
facsimiles, locally-generated data, and incoming faxes. The system is highly error-
tolerant and largely fail-safe.
At the heart of wfBase is the database with a quadrilingual (German, French, Italian,
English) thesaurus for subject-area classification. The thesaurus includes over 2000
subject areas, organized hierarchically and in linked structure over four levels.
wfBase
Multilingual Thesaurus
The two images show different views of the thesaurus for thematic
keywords (here in German). The thesaurus supports the user in navigation,
jump-functions, short-key-entries, synonym-retrieval and other techniques for
easy-to-use access.
[Screen I and Screen II: screenshots of the thesaurus mask for the subject-area field ("Sachgebiet"), taken from the Online '92 presentation; not reproduced here.]
Fig. 19: wfBase multilingual thesaurus, showing two windows of the thesaurus screen. The left
shows the branching from a broad term to a list of narrower terms. The thesaurus contains
the subject areas covered in the dossiers.
In addition to the thesaurus, there are selection lists for other fields and fields for text
and data entry. The database enables the user to locate documents regardless of
the language in which they were entered. However, the system displays documents
only in their language of origin; in a multilingual country like Switzerland it is not
necessary to translate the contents of documents, as users are expected to be
multilingual as a matter of course. Instead, the objective of wfBase is to improve
communication between office locations, standardize addresses and documentation,
eliminate redundancies, and provide third parties (members of the wf's supporting
organizations) with a simple, time-saving and cost-effective means of access.
3.2 HEMIS
Within the United Nations Environment Programme, or UNEP, there is an
organization called UNEP/HEM (Harmonization of Environmental Measurement)
which is responsible for the harmonization of environmental monitoring methods,
plans, projects and information. Since 1990 a project has been underway at the
Munich UNEP/HEM office to implement an information and meta-database system
for the UNEP/HEM, called HEMIS (= HEM Information System). HEMIS is intended
to provide an overview of:
a) Current global and national environmental projects by the UN and other
international and world organizations
b) Institutions, research emphases, periodicals, and key personnel
c) Methodology, reference materials, etc.
d) Databases, data formats, data quality, access, etc.
The information contained in HEMIS is meta-data compiled from widely varying
sources (Figs. 20 and 24).
HEMIS
UNITED NATIONS ENVIRONMENT PROGRAMME
HARMONIZATION OF ENVIRONMENTAL MEASUREMENT
UNEP / HEM, Nairobi / Munich
The UNEP / HEM Office harmonizes nomenclature, measurements and other
information used worldwide in environmental projects. This task will be
supported in the future by the HEMIS meta-database and information system,
a multilingual PC system using CD-ROM.
- Multilingual thesauri for scientific nomenclature, countries, climates, etc.
  with references, links, synonyms, homonyms, acronyms and wildcard functionality
- Harmonization of nomenclature by standardized access to information
- Hyperlinks, guided tours, global search facilities together with the thesaurus
  enable easy access to the information independent of the language of entry
- CD-ROM based worldwide distribution
Fig. 20: HEMIS - Information and meta-database system of the UN environmental organization
UNEP/HEM
The goal is to harmonize access to heterogeneous information of varying quality and
extent from varying sources.
HEMIS is made up of two component systems:
a) One system will be installed in Munich. It will be used to collect, process, and
manage all information and to make the contents readily accessible. The
system is also intended to create reports (printouts) selected from its
database and to create CD-ROM databases.
b) The other will handle worldwide distribution of extracts from HEMIS in Munich by
CD-ROM in regularly updated editions.
The two component systems will have differing user interfaces, databases, etc. (see footnote 1).
System a) is a production system that will generally be used only by UNEP/HEM
employees. System b) is designed to provide information internationally on
environmental projects, prevent parallel developments, and supply basic project and
database data, even if the information is not available in the user's own language.
The HEMIS CD-ROM will be made as attractive as possible so that it is widely used,
and so that other institutions not associated with the UN will be motivated to supply
data for the system (Fig. 21).
[Figure: Harmonization and distribution of information via HEMIS. Sectoral, regional, and specialized sources of environmental meta-data (for example INFOTERRA, ESA, EEA-TF, WMO, GEMS, IAEA, UN, UNEP, EARTHWATCH, governments, NGOs, and others) feed the HEMIS high-level data model (methods/models, classification systems, databases, programmes, institutions, persons), from which the information is distributed to users.]
Fig. 21: Information harmonization and distribution by HEMIS. Data on paper, diskette and CD
is read into the stationary HEMIS, selected and formatted, classified semi-automatically or
manually following a defined nomenclature (thesauri), and finally distributed in the form of
printed reports on specific subjects or on CD-ROM. This figure shows only a representative
sample of the participating organizations.
1 At this writing (late 1992) HEMIS is still at the design and prototype stage. Not all components have
been implemented as yet.
The major components of both the stationary and the CD-ROM HEMIS systems are
a number of electronic thesauri, structured as shown in Figs. 11, 12, and 22.
The Internal Structure of the HEMIS Thesaurus
A  Unique identifier (ID)
   Format: numeric; one entry; 8 digits
   Use: unique reference key for the descriptor database
B  IDs of predecessors (ISO TT, BT, links)
   Format: numeric; up to 64 entries of 8 digits each (sequence of digits)
   Use: internal management; bidirectional
C  IDs of followers (ISO NT, links)
   Format: numeric; up to 255 entries of 8 digits each (sequence of digits)
   Use: internal management; unidirectional
D  Main descriptor for display in the hierarchy of the thesaurus
   Format: alphanumeric; one entry; up to 20 characters (due to display restrictions)
   Use: retrievable via hierarchical selection list and global search
E  Position of “D” in the hierarchy
   Format: numeric; one entry; up to 8 digits (max. of 8 hierarchy levels)
   Use: for screen display in the hierarchical thesaurus only
F  Synonyms, acronyms, homonyms, interpretations, etc. of “D”
   Format: numeric; up to 255 entries of up to 40 characters each (sequence of texts)
   Use: retrievable via global search
G  Explanation
   Format: alphanumeric; one entry; up to 255 characters
   Use: available as context-sensitive help function
Fig. 22: Structure of the HEMIS thesaurus for geographical units, climate zones, subject areas, and
other hierarchically structured reference keys. For an explanation of the entries in the first
row see Section 2.5 and Fig. 12.
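The field limits listed in Fig. 22 lend themselves to a simple record check. The following Python sketch is only an illustration: the class `ThesaurusRecord`, its attribute names, and the choice to store the position as a string of digits are assumptions, while the numeric limits are those given in the figure.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ThesaurusRecord:
    ident: int                                             # A: unique identifier
    predecessors: list[int] = field(default_factory=list)  # B: broader terms, links
    followers: list[int] = field(default_factory=list)     # C: narrower terms, links
    descriptor: str = ""                                   # D: main descriptor
    position: str = ""                                     # E: position of D
    variants: list[str] = field(default_factory=list)      # F: synonyms, acronyms
    explanation: str = ""                                  # G: help text

    def problems(self) -> list[str]:
        """Return the violated limits (empty if the record is valid)."""
        found = []
        ids = [self.ident, *self.predecessors, *self.followers]
        if any(not 0 <= i <= 99_999_999 for i in ids):
            found.append("identifiers must fit in 8 digits")
        if len(self.predecessors) > 64:
            found.append("at most 64 predecessors")
        if len(self.followers) > 255:
            found.append("at most 255 followers")
        if len(self.descriptor) > 20:
            found.append("descriptor longer than 20 characters")
        if len(self.position) > 8 or not self.position.isdigit():
            found.append("position must be 1 to 8 digits")
        if len(self.variants) > 255 or any(len(v) > 40 for v in self.variants):
            found.append("at most 255 variants of up to 40 characters")
        if len(self.explanation) > 255:
            found.append("explanation longer than 255 characters")
        return found
```

A check of this kind could run when thesaurus entries are imported, so that over-long descriptors or identifiers are rejected before they reach the display layer.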
The thesauri and selection lists are part of both HEMIS systems. In the stationary
system they are used in making key words for data sets, documents, graphics,
images etc., and for searching and compiling data. If information is supplied on
computer media in pre-agreed formats, some of the key-word creation process can
be done by the system automatically. In the CD-ROM version the thesauri, selection
lists and all other entries are used only for researching and compiling information.
The HEMIS CD-ROM version has a multi-layer modular structure (see Fig. 23).
[Figure: HEMIS system layout with multilingual user interfaces. Top layer: the user interface (e.g. English) and additional user interfaces in different languages. Next layer: query by example, global search, thesauri, selection lists, guided tours, links, and a language translator. These draw on the descriptor database (a field-oriented database holding standard alphanumeric and numeric variables and numeric keys related to thesauri and selection lists), the database of guided tour links, and the hyperlinks (part of the stored objects). Bottom layer: the information retrieval and access system (IRAS) managing the objects (texts, images, datasets).]
Fig. 23: HEMIS system layout with multilingual user guidance and search. The user interfaces in the
various languages make up the first layer. The next layer is composed of modules for
different search and navigation strategies, likewise language-specific. In addition to a
database, HEMIS has prearranged "guided tours" and "links". The information and
documents on the CD-ROM are managed by an Information Retrieval and Access System
(IRAS).
In addition to searching for certain key words or terms, HEMIS also offers navigation
assistance in the form of prearranged "guided tours" and individual links. A global
database search takes a certain amount of time, but it does allow the user to use the
system without prior knowledge of what contents lie behind a given field in the
search mask. The user interface can be toggled among different loadable
languages, as can the thesauri, selection lists, links and guided tours. Free text input
and scanned-in documents are not translated. HEMIS is intended to provide the
initial information; the user can then consult the source institutions, databases, or
publications for more in-depth information.
Fig. 24 shows the proposed starting screen of the HEMIS prototype with the button
fields for moving to the main subject-area screens.
[Figure: start screen of the HEMIS prototype, titled "HEMIS Environmental Information System". Buttons lead to Institutions, Programmes, Ref. Material, Methods, Databases, Region, Location, Subject Thesaurus, Guided Tours, Help, and Exit. A language selector offers Français ("Choisir"), English ("Choose"), and Deutsch ("Wählen Sie").]
Fig. 24: HEMIS starting screen (suggested CD-ROM version)
4. Outlook and Summary
The development of multilingual information and retrieval systems has only just
begun.
Conclusions
Multilingual Information and Retrieval Software
The European Challenge for 1993
- Multilingual software is a must for all companies and organizations working
  in different European countries
- The American software industry is presently unable to supply multilingual software.
  This is a window of opportunity for European software companies
- Multilingual software helps to bridge the national barriers within Europe
- Multilingual software is intelligent object-oriented programming
  using databases and information management systems as a framework
  for huge masses of coded and non-coded information
Fig. 25: Summary of the most important arguments for multilingual software
In this article, the following arguments have been advanced (Fig. 25):
a) Multilingual software is a necessity for all organizations with Europe-wide or
world-wide activities, for which a single "company language" is undesirable or
impracticable.
b) Multilingual software is available in its basic features as standard software, but
as a rule it must be modified for the specific application before it can be used to
full benefit (compare wfBase, Section 3.1, and HEMIS, Section 3.2).
c) Multilingual retrieval software can be used for accessing large quantities of data
or documents on digital optical storage media.
d) Multilingual thesauri encourage standardization in document classification,
enable clear and structured access to documents, and support searches for
documents not in the user's own language.
e) Multilingual fulltext retrieval and translating systems are in use in prototype form.
Combined with other techniques, such as thesauri, they will make easy-to-use
information systems feasible in the future.
f) Multilingual software is a market opportunity for European software and systems
firms.
g) Multilingual retrieval and information systems can be used to advantage in
almost all areas of business and administration which extend beyond national
and cultural boundaries.
IMC Congress, Brussels
Page 28 of 28 © Copyright PROJECT CONSULT GmbH 1993

More Related Content

Similar to [EN] "Multilingual Information and Retrieval Systems Technology and Applications" | Dr. Ulrich Kampffmeyer | IMC Congress 1993 | Brussels

Computer Software and It's Development
Computer Software and It's DevelopmentComputer Software and It's Development
Computer Software and It's DevelopmentRabin BK
 
Computer software is defined .docx
Computer software is defined       .docxComputer software is defined       .docx
Computer software is defined .docxKamran Abdullah
 
ICT, Importance of programming and programming languages
ICT, Importance of programming and programming languagesICT, Importance of programming and programming languages
ICT, Importance of programming and programming languagesEbin Robinson
 
Knowledge management tools
Knowledge management toolsKnowledge management tools
Knowledge management toolsmohsen seyedi
 
Ontological approach to the specification of properties of software systems a...
Ontological approach to the specification of properties of software systems a...Ontological approach to the specification of properties of software systems a...
Ontological approach to the specification of properties of software systems a...Patricia Tavares Boralli
 
Presentation on computer softwares
Presentation on computer softwaresPresentation on computer softwares
Presentation on computer softwaresinderbipasha
 
Technology Infrastructure For The Pervasive Vision, Does It Exist Yet?
Technology Infrastructure For The Pervasive Vision, Does It Exist Yet?Technology Infrastructure For The Pervasive Vision, Does It Exist Yet?
Technology Infrastructure For The Pervasive Vision, Does It Exist Yet?Olivia Moran
 
ML Tutorial Introduction
ML Tutorial IntroductionML Tutorial Introduction
ML Tutorial Introductionelbop
 
software development and programming languages
software development and programming languages software development and programming languages
software development and programming languages PraShant Kumar
 
Tim willoughby open source-in-local-government
Tim willoughby open source-in-local-governmentTim willoughby open source-in-local-government
Tim willoughby open source-in-local-governmentOpenSourceLGMA
 
Application of computers
Application of computersApplication of computers
Application of computersDashvina
 

[EN] "Multilingual Information and Retrieval Systems Technology and Applications" | Dr. Ulrich Kampffmeyer | IMC Congress 1993 | Brussels

Multilingual Information and Retrieval Systems
Technology and Applications

IMC Congress, Brussels 1993
© Copyright PROJECT CONSULT GmbH 1993

Dr. Ulrich Kampffmeyer
• VOI Verband Optische Informationssysteme, Roßdorf / Darmstadt
  German Association of Manufacturers and Resellers of Digital Optical Media, Systems and Software (Chairman of the Board)
• PROJECT CONSULT Unternehmensberatung Dr. Ulrich Kampffmeyer GmbH
  Wachenheim, Hamburg, Darmstadt

Abstract

This paper on multilingual information and retrieval systems with optical mass storage describes the technical principles of software design. The different layers and modules, from the user interface via transformation modules, thesaurus modules and fulltext interpretation to database management, are explained in detail. Two examples of multilingual document imaging systems are presented:
- wfBase, a multilingual press and commerce information system based on four ISDN nodes in Switzerland;
- HEMIS, a multilingual information system for CD-ROM distribution on environmental institutions, projects and programmes of the UN Environmental Programme UNEP/HEM.

Contents

1. The Importance of Multilingual Software Systems With Optical Storage Media for the European Economic Region
2. Software Design
2.1 Structural and Other Requirements for Multilingual Software
2.2 User Interface and Application
2.3 Transformation Modules
2.4 Selection Lists
2.5 Thesauri
2.6 Fulltext Translation
3. Sample Applications
3.1 wfBase
3.2 HEMIS
4. Outlook and Summary

1. The Importance of Multilingual Software Systems With Optical Storage Media for the European Economic Region

Europe 1993 is a catch-phrase that is often heard. But opening the borders and removing trade barriers will not eliminate the cultural and language differences between countries. These differences are a concern for all firms and organizations that operate in more than one country.

Overcoming the language barrier is not simply a matter of lexical comprehension and translation. It involves many levels of differing interpretations, meanings in various contexts, and adaptation of specialized vocabulary. In business and commerce, mere translation is not enough; the unwritten laws of the target specialist language must be adhered to. In addition, an organization working across national boundaries must take into account differing units of measure, currency, and conventions (date formats, addresses, orthography).

Multilingual software is a requirement wherever users require access to the same information regardless of the nature of the source.
This is particularly the case for:
- Trading firms
- Service firms
- International authorities and institutions
- Manufacturers with suppliers and subcontractors in more than one country
- Communications firms
- Banks
- Insurance companies
- Authorities and bodies such as police, air-traffic control, disaster relief organizations, environmental monitoring agencies, etc.
- Others

English is often used as a de facto communications standard. However, the use of a language that is foreign to its speakers can lead to misinterpretations and misunderstandings when the user is not familiar with the exact meaning, interrelationships, and contextual significance of terms and phrases. A "working knowledge" of a language is not enough.

As software and the information underlying it become more complex, the support provided by the software must become friendlier and more comprehensive. This is especially true for the user interface, information on current actions, status messages, context-sensitive information (especially in the case of user mistakes or critical program branchings) and help screens. The latter must be available in index form as well as context-sensitively. Modern "Windows"-oriented programs generally include these features. However, as in most programs, the information they contain is available in only one user language.

Most standard software today comes from a few leading software houses in the United States. Consequently the software and documentation are available in English, or American English, first. The various language versions are then translated from the original English version. The translations become available with more or less delay in various release standards, depending on the relative importance of the national market.

In such standard software, the screens, associated texts, etc. are contained in the main body of the program, making translation with adaptation of the screens and texts a very complex undertaking. Even when different users access identical information, they cannot change the user language while the program is running. Instead, the complete target language version must be started, at considerable cost in time.
In addition, most standard software lacks the integrated database or resource management components that enable the administration of different language and function modules, not to speak of the creation and maintenance of such modules. Thus, "traditional" standard software has a built-in bias against multilingual use. This article will examine database information systems that are suitable for multilingual applications.

2. Software Design

Like the ability to load modular program segments and functions separately, multilingualism must be designed in from the start. It is well-nigh impossible to modify finished software to support multilingual operation. In such cases, it makes more sense to completely redesign the software using modern tools.

2.1 Structural and Other Requirements for Multilingual Software

Multilingual software is subject to the following design criteria (Fig. 1):

a) Modular design with clear logical and software partitioning of the various levels (user interface, main program, resources, transformation modules, database, etc.). Interaction is controlled through messages and global variables.
b) The program segments responsible for execution must not contain any text components; all texts must be referenced by variables. The user can switch from one language to another by means of a global variable.
c) Texts are kept in resource libraries and accessed by variables. The libraries must be simple to maintain and the texts must be accessible and loadable in the application during runtime.
d) All parts of an application must have defined interfaces. This is particularly the case for the user interface, the actual application itself, the operating system and all additional application modules.
e) The application, user interface and operating system must support variable text field lengths and positions, since these can vary greatly from language to language.
f) The application, operating system, screen and printer drivers, and the database must support a variety of fonts, character sets, sort orders, data formats, etc. This presupposes corresponding support in the underlying operating system.

Multilingual Software - Design Principles
- Modular design with clear separation of user interface, operating system and application (database)
- Every text component has to be referenced by a key variable in the application
- Resource libraries easy to link and to maintain (e.g. with a text editor)
- Defined interfaces between the user interfaces, operating system and application modules
- Variable text field positions and field lengths in the user interface modules of the application
- Support of different sets of fonts, language-specific characters, keyboard layouts, date formats etc. by the underlying operating system
Fig. 1: Multilingual software design criteria

Such a multilingual application is thus divided into several inter-communicating modules and levels (Fig. 2).
The actual application program, which can be part of the database, uses messages and global variables to control the language selection, display and printing, and the search and conversion functions.
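Criteria b) and c) can be illustrated with a minimal Python sketch. It is not taken from the systems described in this paper: the names RESOURCES, Application, set_language and text, as well as all keys and translations, are hypothetical and chosen only for illustration. The program code refers to texts exclusively through keys, and a single language selector decides which resource library supplies the text.

```python
# Hypothetical resource libraries: one dictionary per language, addressed by
# language-independent key variables. Keys and texts are illustrative only.
RESOURCES = {
    "de": {"PROMPT_SELECT": "Bitte wählen ...", "BTN_SEARCH": "Suchen"},
    "en": {"PROMPT_SELECT": "Please select ...", "BTN_SEARCH": "Search"},
    "fr": {"PROMPT_SELECT": "Veuillez choisir ...", "BTN_SEARCH": "Rechercher"},
}


class Application:
    """Holds the language selector; contains no display text itself."""

    def __init__(self, lx="en"):
        self.lx = lx  # language selector, switchable during runtime

    def set_language(self, lx):
        # Reject languages for which no resource library is available.
        if lx not in RESOURCES:
            raise ValueError(f"no resource library for language {lx!r}")
        self.lx = lx

    def text(self, key):
        # Every displayed text is looked up by key in the current library.
        return RESOURCES[self.lx][key]


app = Application("en")
print(app.text("PROMPT_SELECT"))  # text from the English library
app.set_language("de")            # switch without restarting the program
print(app.text("PROMPT_SELECT"))  # same key, text from the German library
```

Because the calling code only ever passes keys, adding a further language in this sketch means adding one more dictionary, with no change to the program logic.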
Fig. 2: Language display during runtime of a multilingual program. (Principles of language display during runtime: the language selector (Lx) in the application defines which language resource, for example German, English, French or Spanish, is used for display and how the information in the data field is represented.)

The variable "Lx" ("Language Resource") determines which texts will be displayed, and which transformation modules and selection lists will be used to control an entry or search in a selected language. The information in the database itself is not changed, but only the screen display and printout.

Figure 3 shows the levels of a multilingual application.

Fig. 3: Multilingual software levels (1-4) and modules. (Layers and modules shown: user interface (Windows, Presentation Manager, X-Windows, etc.), user interface (application), application, language resources, transformation modules, selection lists, thesauri, language interpreter, database, operating system, driver, IRS Information Resources Management.)

Level 1 essentially handles the presentation of the information, Level 2 converts the information from one language to another, Level 3 manages the access information and handles searches, and Level 4 manages the "documents" (datasets, images, graphics, etc.) on optical storage media. This article will not go into Level 4, the "IRS" Information Resources Management Program, in greater detail (compare
Kampffmeyer, Ulrich: "Combined WORM and Magneto-optical Mass Storage Devices and Procedure-Oriented Information Processing Systems", GI Gesellschaft für Informatik, Arbeitskreis "Datenbanken", Conference at the University of Oldenburg, Germany, on Feb. 19, 1990). Levels 1-3 and their components will be explained below.

2.2 User Interface and Application

The user interface depends to a large extent on the underlying operating system. Many operating systems are not up to the demands of multilingual software, since they do not allow for reconfiguration during runtime and do not support international character sets and formats. Operating systems with graphic user interfaces, like Microsoft Windows and OS/2 Presentation Manager, and operating systems based on X-Windows (OSF Motif, OpenLook, etc.) are suitable. These systems allow control of the screen largely independent of the actual operating system itself.

There is a fundamental difference between
a) the standard Windows interface, and
b) the application-specific interface implemented on the basis of this interface.

The application-specific interface uses the tools provided by the standard interface to represent the functions of the application. A graphic user interface has numerous advantages: a gentler learning curve, integrated help functions, and simple operation by mouse, menus, or key combinations. Another advantage of Windows interfaces is that users can freely resize windows and other displays.

It would be impractical to give all displays of a multilingual application their own user interface, since this would severely limit the number of compatible screen and printer drivers. The application's user interface should use standard Windows interface routines wherever possible.
The user interface (Windows as well as application) of a multilingual application should include the following (see Fig. 4 and 5):
a) Change of key assignments for differing language keyboard layouts during runtime by the application
b) Change of screen display during runtime by the application
c) Display of language-specific character sets (e.g. German: ä, ö, ü, ß; French: é, è, ê, ç; Spanish: Í, ñ, ¿, ¡; Danish: å, æ; Hungarian: ÿ, ý, ï; Greek: α, β, γ, etc.)
d) Change of formats, as for date, currency, time, etc. during runtime
e) Automatic adaptation of screens and fields to differing text lengths, special symbols, fonts, etc., under the given monitor resolution
f) Language-specific context-sensitive help based on the cursor position, current program status and the feasible or just completed action
g) Modules loadable during runtime without leaving the program
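Requirement d), changing date and number formats during runtime, might be sketched as follows. This is a simplified illustration in Python under stated assumptions: the Formats class, the FORMATS table and both helper functions are hypothetical, and the conventions listed (including US-style dates for "en") are examples, not a complete description of any national standard. The stored value stays the same; only its representation depends on the language selector.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Formats:
    date_pattern: str   # strftime pattern for dates
    decimal_sep: str    # decimal separator
    thousands_sep: str  # digit grouping separator


# Hypothetical format table, one entry per language selector value.
FORMATS = {
    "de": Formats("%d.%m.%Y", ",", "."),
    "en": Formats("%m/%d/%Y", ".", ","),  # assumption: US convention
    "fr": Formats("%d/%m/%Y", ",", " "),
}


def format_date(d, lx):
    """Render a stored date according to the selected language."""
    return d.strftime(FORMATS[lx].date_pattern)


def format_amount(value, lx):
    """Render a number with language-specific separators."""
    f = FORMATS[lx]
    text = f"{value:,.2f}"  # neutral form with "," grouping and "." decimal
    # Swap separators via a placeholder so the two replacements cannot clash.
    return (text.replace(",", "\0")
                .replace(".", f.decimal_sep)
                .replace("\0", f.thousands_sep))


stored = date(1993, 3, 7)
for lx in ("de", "en", "fr"):
    print(lx, format_date(stored, lx), format_amount(1234.5, lx))
```

In a real system these conventions would more likely come from the operating system's locale support than from a hand-written table; the table is used here only to keep the sketch self-contained.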
  • 7. Operating System and User Interface Requirements for European Software
The operating system and window interface must support several features to enable switching the language during runtime:
- Change of keyboard setting and screen display during runtime via an external program
- Enhanced keyboard setting with special characters of European languages (ç, ê, æ, å, ø, ä, etc.)
- Support for, and change during runtime of, date and time formats
- Graphic interface with virtual window architecture to allow different sizes of screens and fields when the language changes
- Context-sensitive help related to the current position of the cursor
Fig. 4: Operating system and user interface
User Interface (Application) Requirements
The user interface has to support several functions to enable a change of language during runtime:
- Object-oriented software
- Change of screens, settings and styles during runtime
- Dynamic positioning of fields
- Automatic adaptation to different field lengths
- Controllable by the application program
- Modules loadable during runtime for messages, windows and help texts
- Dynamic data and message interchange with operating system and user interface, application program and database
Fig. 5: User interface
The most important feature of a multilingual application is convertibility during runtime, without having to load and start another program and without changing the screen and screen information content (Fig. 4).
  • 8. The text components are kept in separate files, called "resource libraries" (Fig. 2). The resource libraries can be loaded on the fly according to the language selection variable Lx. For language resources to be usable, all texts in a program that are going to be displayed or printed must be referenced by an unambiguous key variable with the appropriate library. Resource libraries must exist for:
a) All static texts in dialogue boxes and masks. These are texts which are associated with a given dialogue box and do not change.
b) Dynamic texts in dialogue boxes and masks. These are texts which change, appear, or disappear according to status (messages). This includes the graying out of inoperative or unavailable functions on menus and buttons.
c) Help texts which appear automatically or interactively.
d) Error messages, system messages and other operation-related messages.
Language Resources Requirements
Language resources are used for displaying texts related to the unique keys in the application:
- Loadable modules for each language
- Every entry in the language resource is referenced by a unique key which may be used by different applications and the database itself
- Language resources are needed for every text on an entry or search screen form, every message, and every help text
- Icons adapted for each country
- Editor or tools for translation support
Fig. 6: Language resources
Many applications use icons and buttons to simplify option selection. If they bear text or abbreviations (such as "B" for bold), these must be converted when the language is changed (thus, in German, "F" for "Fett" = bold). For this reason these icons and buttons should likewise be kept in dedicated resource libraries instead of being managed directly in the program.
The same applies to icons with graphics, where the graphics do not convey the same meaning in a different language area or country.
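The resource-library idea described above can be illustrated with a minimal Python sketch. It is not the implementation used in the systems described here: the numeric keys, the sample texts, and the names `RESOURCE_LIBRARIES` and `LanguageResources` are assumptions made for illustration. The point it shows is that the program refers to every text only by its key, so that changing the language variable Lx swaps the library without touching the program logic.

```python
# Hypothetical resource libraries, one per language. In a real system each
# would be a separately loadable file; here they are plain dictionaries.
RESOURCE_LIBRARIES = {
    "en": {1001: "Bold", 1002: "File not found", 1003: "B"},
    "de": {1001: "Fett", 1002: "Datei nicht gefunden", 1003: "F"},
}


class LanguageResources:
    """Looks up display texts by unique key for the currently selected language."""

    def __init__(self, libraries, lx):
        self._libraries = libraries
        self.switch(lx)

    def switch(self, lx):
        # Change the language during runtime; the keys used by the program stay the same.
        if lx not in self._libraries:
            raise KeyError(f"no resource library for language {lx!r}")
        self.lx = lx
        self._texts = self._libraries[lx]

    def text(self, key):
        # Return the text stored under the key in the active library.
        return self._texts[key]


resources = LanguageResources(RESOURCE_LIBRARIES, "en")
english_label = resources.text(1003)  # label of the "bold" button in English
resources.switch("de")
german_label = resources.text(1003)   # same key, German library
```

The same key (1003 in this sketch) yields "B" or "F" depending only on the loaded library, which mirrors the button example in the text.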
  • 9. Object-oriented programming languages and databases often support the use of loadable resources, making them preferable to traditional programming tools. The right choice of tools is important for the creation of applications based on a programming language or database. The application is the superposed, integrative component of the system as a whole (compare Figs. 2 and 7). The application contains not only the usual data-processing algorithms and input/output modules, but also the control and selection of language resources (transformation modules, selection lists, thesauri, help texts, messages, screen layout and display, etc.).
Application Characteristics
- Object-oriented, message-driven program
- Direct control of database and user interface
- Numeric keys for every text entry related to the screen display and database fields
- Transformation modules, selection lists, thesauri, language interpreters and language resources as loadable modules
- Database as loadable module or client-server communication via SQL
Fig. 7: Components of the application
Object-oriented programs with a "message" concept, such as Microsoft Windows, allow continuous control of the resources used and the condition of the screen. Direct communication should be set up for control of the modules on level 2 (Fig. 3). SQL can be used as a standardized interface for communication with the database in which the actual information is kept and managed. All modules on levels 2, 3, and 4 (Fig. 3) should be directly accessible or loadable during runtime.
2.3 Transformation Modules
The numerical information in the database is stored in a format that can be converted as needed for a given loaded language resource. This conversion is controlled by the variable "Lx" (Fig. 2).
Transformation modules are considerably easier to implement than text translators, since they work by exact rules and with numeric values only (Fig. 8).
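As a rough illustration of why such modules are rule-based and simple, the following Python sketch converts a stored date and a stored distance for display according to a language variable. The format table `DATE_FORMATS`, the language codes, and the function names are assumptions for this example and are not taken from the systems described here.

```python
from datetime import date

# Hypothetical display rules per language variable Lx.
DATE_FORMATS = {"en-US": "%m/%d/%Y", "de": "%d.%m.%Y"}
KM_PER_MILE = 1.609344  # exact definition of the international mile


def format_date(value: date, lx: str) -> str:
    # The database keeps one language-neutral date; only the display differs.
    # A four-digit year avoids ambiguity across the change of century.
    return value.strftime(DATE_FORMATS[lx])


def format_distance(km: float, lx: str) -> str:
    # Distances are assumed to be stored in kilometres and converted for display.
    if lx == "en-US":
        return f"{km / KM_PER_MILE:.1f} mi"
    return f"{km:.1f} km"


stored_date = date(1993, 3, 7)
us_date = format_date(stored_date, "en-US")
de_date = format_date(stored_date, "de")
us_distance = format_distance(100.0, "en-US")
```

Because the stored value is a date object rather than a formatted string, sorting stays correct whichever display format is selected.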
  • 10. Transformation Modules: Types
Transformation modules are used for the display transformation of numeric values of the database:
- Transformation of date formats (supported by operating system)
- Transformation of time formats (supported by operating system)
- Transformation of addresses (position of postal codes, etc.)
- Transformation of units of measure (litre to gallon, km to mile, etc.)
- Transformation of internationally standardized nomenclature (country and city names, etc.)
- Transformation of user-defined values (see selection lists, etc.)
Fig. 8: Transformation modules
The most important standard transformation modules are:
a) Date formats
This module toggles the display format of dates between American (month-day-year) and European (day-month-year). This function is often supported by the operating system directly, and allows use of either the months' full names or their abbreviations. The transformation module should be designed to cope with the conversion of pre-2000 dates into the next century. This is important for all data which must be retained for several years. The date transformer module must also ensure proper sorting during display.
b) Time formats
The same applies to time-display formats. For firms active on an international scale, data is best stored in "Coordinated Universal Time" format (UTC). Date and time transformation modules can be set up to check whether the system's internal time setting is correct (the current date and time must always be later than that of the last document to be saved; calibration with standard working hours and days, etc., in order to be able to determine system down time if necessary).
c) Address conversion
Address-format conversion affects printouts more than it does on-screen displays.
Addresses in Europe are not standardized, and use a variety of sequences of street, house number, and postal code. This transformer module recognizes the country of the addressee and selects the appropriate address format for printouts.
  • 11. d) Units of measure
This is an important requirement for international trade and manufacturing companies. For example, in the oil industry large quantities of different types of oil and petroleum products are transported and handled daily. Measurement values, and with them customs and tax rates, constantly fluctuate, depending on the type of product and its specific weight and even on the ambient temperature. In cross-border trade the units of measure as well as of currency must be automatically converted. The most important categories are units of currency, distance, weight, and volume.
e) EDI data
Standardised electronic data interchange (EDI), such as EDIFACT, allows entire business transactions to be handled electronically, without paper originals. The data is archived digitally. For display and printouts, EDI codes are converted into text. This conversion can be made language-specific through a language control variable. With EDI data it is necessary to know what version of a given EDI application the data will be converted with.
Further transformation modules can be added to cover other requirements for specific industries and applications, for example converting product codes into text.
2.4 Selection Lists
Graphic interfaces like Microsoft Windows support single and multiple selection lists (Fig. 9). With single selection lists only one item on a given list can be marked and processed. With multiple selection lists, one or more items can be selected.
Selection Lists: Characteristics
Selection lists are an easy way to translate information and to save storage capacity:
- The list displays a text on the screen related to a database value
- Every entry in a selection list refers to a value which is related to a database field
- Every entry in the different language versions of a list refers to the same value
- The database has to store only the numeric value of the entry
- Selection lists help to standardize nomenclature in multinational and multilingual organizations
- Selection lists can be used as single and multiple-choice lists
Fig. 9: Selection Lists
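The characteristics listed in Fig. 9 can be sketched in a few lines of Python. The list contents, the field names `doc_type` and `topics`, and the helper functions are hypothetical; the sketch only shows that the record stores numeric values while the displayed text comes from the list version selected by the language variable.

```python
# Hypothetical selection list in two language versions. Every language
# version maps the same numeric value to its own display text.
SELECTION_LIST = {
    "en": {1: "Press article", 2: "Periodical", 3: "Internal document"},
    "de": {1: "Presseartikel", 2: "Zeitschrift", 3: "Internes Dokument"},
}


def display(value: int, lx: str) -> str:
    # Translate a stored numeric value into the text of the chosen language.
    return SELECTION_LIST[lx][value]


def choices(lx: str):
    # Entries offered to the user, as (stored value, displayed text) pairs.
    return sorted(SELECTION_LIST[lx].items())


# The database record stores only numbers: one value for a single-selection
# field and a list of values for a multiple-selection field.
record = {"doc_id": 42, "doc_type": 2, "topics": [1, 3]}

shown_in_german = display(record["doc_type"], "de")
shown_in_english = display(record["doc_type"], "en")
topics_in_english = [display(value, "en") for value in record["topics"]]
```

A search for periodicals then compares the stored value 2 and is independent of the language in which the entry was made.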
  • 12. Selection lists offer several advantages over regular text-entry fields in database applications:
a) No typing mistakes.
b) Selection lists keep the database uniform and ensure that entries can be easily found again. Since the user must choose from among a set of given expressions, entries are standardized.
c) The database stores only a reference number which refers to a text resource. This keeps space requirements low, and different text resources can be accessed depending on the language variable. Retrieval is faster, since the system must search only through predefined numbers instead of text sequences.
d) Multiple selection lists facilitate the multiple allocation of a document and allow the user to select a number of related items if he/she is unsure about the allocation to a single one.
Point c) above is the most important factor for multilingual applications. The use of reference numbers allows linkage to multiple lists in different languages. The reference numbers can also be used to limit access, so that only cleared items are shown in a search. Selection lists also facilitate data entry through the use of presettings for recurring entries.
Selection lists can be created with standard text editors. However, this should be done only by authorised persons, since changes to, and especially deletions of, entries in a selection list can compromise the consistency of the database. Strict update and maintenance rules are a must for distributed systems and resources.
Selection lists with restricted vocabulary are the ideal medium for standardising terminology within a company and for creating multilingual software systems. Multilingual systems should avoid free text entry wherever possible and use selection lists whenever feasible.
2.5 Thesauri
This term has widely differing meanings.
In its original meaning it refers to a defined specialist terminology, broken down hierarchically from the general down to the precise. The terms differ clearly from one another and are distributed over several hierarchical levels. A generic term at one level branches into a number of more precise terms on the level below it. All terms at a given level should be at a similar level of detail. However, in many word-processing programs the "thesaurus" is simply a utility showing possible synonyms. This familiar kind of thesaurus is completely unrelated to the structured terminology system described above, as defined, for example, by the International Organization for Standardization (ISO) for monolingual and multilingual thesauri.
  • 13. Thesauri
Thesauri offer a hierarchically structured and crosslinked nomenclature:
- One field on the screen may be represented by a structured hierarchical thesaurus
- Similar to a selection list, the thesaurus displays a text related to a database value
- The thesaurus offers navigation and interpretation tools
- The thesaurus is a database in itself, which relates numeric values to texts and provides additional structure through hierarchical order and crosslinks
- The structure of thesauri is standardized by ISO
- The same thesaurus may be used by different applications
Fig. 10: Thesauri
Seen from the outside, the thesauri we are discussing here for multilingual systems act similarly to selection lists (Fig. 10, compare also Fig. 9). First a list of generic terms is displayed (ISO Top Term; TT). Once a top term has been selected, the more precise terms subordinated to it are shown on a second list (ISO Narrower Term; NT). When one of these is selected it forms the new generic term (ISO Broader Term; BT) for the next level of narrower terms (Fig. 11).
This strict hierarchy is fully applicable to only a few subject areas. Therefore, the ISO standard provides for crosslinks. These link terms from different levels and branchings independently of their position in the hierarchy. This is easier to follow in a program than it is to describe in print.
An electronic thesaurus is referenced by numbers in the program just as a selection list is (see Section 2.4). However, unlike a selection-list entry, a thesaurus entry includes not only a "unique identifier" number in the database, but also flags which specify its display position (level and branching in the hierarchy) and the type (and if necessary direction) of links.
The links allow a term to be associated with more than one top or broader term in other branchings, as well as the linkage of a broader term to several narrower terms in other branchings, regardless of the position in the hierarchy. The use of different links (unidirectional, bidirectional, broad-to-narrow, narrow-to-broad, additional reference, synonym, etc.) makes it easier to navigate in such a system.
In principle the electronic thesaurus is an entire database application, which stands between the user interface and the database proper. The database proper stores only the unique identifier. If this identifier refers to a "narrower term", all associated broader terms up to the top term can be found using its links and hierarchical position.
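The lookup described in the last paragraph can be sketched as follows in Python. The identifiers follow the numbering style of Fig. 11, but the term labels, the `Term` record layout, and the function names are assumptions for illustration; the ISO standards define the relationship types, not this data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    uid: int                       # unique identifier stored in the database proper
    broader: Optional[int]         # uid of the broader term; None marks a top term
    labels: Dict[str, str]         # display text per language
    crosslinks: List[int] = field(default_factory=list)  # links outside the hierarchy


# Hypothetical thesaurus fragment.
THESAURUS = {
    1000: Term(1000, None, {"en": "Economy", "de": "Wirtschaft"}),
    1100: Term(1100, 1000, {"en": "Trade", "de": "Handel"}),
    1200: Term(1200, 1000, {"en": "Industry", "de": "Industrie"}),
    1110: Term(1110, 1100, {"en": "Exports", "de": "Exporte"}, crosslinks=[1210]),
    1210: Term(1210, 1200, {"en": "Manufacturing", "de": "Fertigung"}),
}


def broader_chain(uid: int) -> List[int]:
    # Walk from a narrower term through all broader terms up to the top term.
    chain = []
    parent = THESAURUS[uid].broader
    while parent is not None:
        chain.append(parent)
        parent = THESAURUS[parent].broader
    return chain


def narrower_terms(uid: int) -> List[int]:
    # Terms shown on the next list once `uid` has been selected.
    return sorted(t.uid for t in THESAURUS.values() if t.broader == uid)


def label(uid: int, lx: str) -> str:
    return THESAURUS[uid].labels[lx]


path_for_exports = broader_chain(1110)
path_in_german = [label(uid, "de") for uid in path_for_exports]
```

Only the identifier 1110 needs to be stored with a document; the broader terms and the crosslink to 1210 are resolved by the thesaurus at display or search time.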
  • 14. Thesauri: Hierarchy and Crosslinks
[Figure: two panels, "The Hierarchical View of the Thesaurus (Top Term, Broader Term, Narrower Term)" and "The Network Structure of the Thesaurus (Crosslinks independent of the hierarchical position)". Each node carries a unique identifier (1000, 1100, 1110, 1120, 1200, 1210, 1220) and a position in the hierarchical view.]
Fig. 11: Hierarchy and virtual linkages (crosslinks)
An electronic thesaurus is represented internally as a network (relational system), but to the outside as a hierarchy. Thus, the composition of a list of terms depends not only on the broader term, but also on the links and the route taken to get to the broader term. Unlike with a selection list, the lists displayed by an electronic thesaurus can differ from situation to situation.
In addition to assisting in navigation by displaying the selection lists specific to a previously selected broader term, a database-supported thesaurus can also be used in "specialist" or "beginner" mode. When entering information, a specialist mode is best; it allows direct entry of a narrower term or an abbreviation, with the system determining the associated broader terms without the user having to go through the hierarchy. However, users who are inexperienced with hierarchical selection lists or with the subject content of the thesaurus are better off doing their searches in beginner mode, in which the system analyses the user's text input, looks for a match in the thesaurus, and if in doubt shows a synonym list and help text suggesting a repeat attempt or a more closely defined query. Such a "global search" can also be extended to further fields or other resources of the thesaurus.
Fig. 12 shows how a number of "slices" are assigned to the reference keys of the thesaurus database.
Each of the language slices contains all information on the hierarchical and network structure of the terms, since this will differ from language to language (narrower or broader terms, different semantic fields). However, regardless of the differences among the languages, the same information must be clearly accessible in all. Therefore, the unique identifier is assigned not only the term itself (main keyword), but also acronyms (e.g. "NASA"), homonyms (words that sound the same but have different meanings), synonyms, plural forms, explanatory notes, etc. This information is also accessed during a global search.
  • 15. The "language slices" need not necessarily contain foreign languages; they can also contain different aspects of a single language. This is particularly useful for specialist languages. Thus, one slice can contain the regular colloquial language, with only two or three levels and accessible to everyone, while another slice can contain the terminology for a specialist field, broken down into more levels and accessible only to those working in that field. This allows control of the extent, depth, and accessibility of information.
Thesauri: "Slice" Model of a Multilingual Thesaurus
[Figure: the unique IDs A1, A2, A3, ..., An each point to one entry in a German, a French and an English language "slice". Every slice entry holds the IDs of predecessors, the IDs of successors, the position in the hierarchy, the main keyword, synonyms, homonyms etc., and a help text.]
Fig. 12: Slice structure of a multilingual thesaurus
In addition to the modular slice structure, an electronic thesaurus database offers many advantages:
a) A standardized, controlled vocabulary ensures unambiguous and complete retrieval of all correctly entered information.
b) Entry errors are prevented.
c) Selection lists and help functions assist the user in finding his or her way through extensive, many-layered specialist vocabularies.
d) Functions like "global search" enable searches to include synonyms, homonyms, acronyms and other references as well as the help text itself.
e) The organization and structure of thesauri are internationally standardized.
f) A thesaurus database acts as a pre-processor, saving time in searches in the database proper, since only short, unambiguous numerical references need be searched and evaluated. The thesaurus then converts the unique identifiers for display.
  • 16. g) Thesaurus databases can be run on a PC LAN, thus reducing the workload on the central database and information resources management (IRS; see below and Fig. 3).
If the system includes optical-systems management software in addition to the thesaurus database and the database proper, it has a three-level database hierarchy (compare Figs. 3 and 26):
a) Database for one or more thesauri (local or central)
b) Database for managing unique identifiers to selection lists and thesauri and for managing database entries (numerical, alphanumeric, date, time, Boolean variables, etc.)
c) Information retrieval and access system (RIAS). As a rule a non-standard database for managing WORM (write-once) media, erasable, rewritable, and M/O optical media, or read-only media (CD-ROM).
A standard database (preferably relational) can be used for the thesaurus database as well as for the database proper. Full-text databases are not suitable for this type of application (Fig. 13).
Database Characteristics
- Support of an optical disk information retrieval system for mass data management
- A standard relational database may be used to manage data (except for language interpretation)
- Standard full-text databases are not usable
Fig. 13: Database characteristics
2.6 Fulltext Translation
The electronic interpretation and translation of running text requires very different strategies from those described up until now. Transformation modules, selection lists and thesauri can be combined in a system as desired, since they all work by the same rules: numerical identifiers are transformed into predefined expressions in defined ways. A system capable of analysing running text is difficult to combine with these modules.
It is an independent and complex software system made up of many component parts (Figs. 14 and 15).
  • 17. Language Interpreter: Characteristics
The language interpreter contains different modules which allow translation and interpretation of full-text databases:
- Dictionaries provide information for the direct translation of nouns (singular, plural, conjunctions, etc.)
- Statistical modules support the interpretation of the noun inside a text
- Linguistic modules support the interpretation of the grammatical context
- Comparison modules combine the different strategies of interpretation
- Presentation modules display the answer to a query in the chosen language as translated full text
- Inverted file and cache modules optimize access
Fig. 14: Components of a language translation system
Language Interpreter: Structure
[Figure: components of the language interpreter between user interface and database: entry, query and display at the user interface; presentation modules, linguistic modules, statistics modules and dictionary modules; comparison; inverted file; database.]
Fig. 15: Structure of a language translation system
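Of the components named in Fig. 14, the inverted file is the easiest to illustrate. The following Python sketch is a generic inverted file, not the module of any particular product: it records for every word the documents and word positions where it occurs, and it deliberately keeps all words, including filler words, on the assumption that an interpretation system needs the complete text. The sample documents and function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_file(documents):
    # Map each word to the set of (document id, word position) pairs where it occurs.
    # No words are dropped, so the original word order can be recovered per document.
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for position, word in enumerate(text.lower().split()):
            index[word].add((doc_id, position))
    return index


def lookup(index, word):
    # Return all occurrences of a word, sorted by document and position.
    return sorted(index.get(word.lower(), set()))


documents = {
    1: "the bank raised the rate",
    2: "the river bank",
}
inverted_file = build_inverted_file(documents)
bank_hits = lookup(inverted_file, "Bank")
filler_hits = lookup(inverted_file, "the")
```

A query for "bank" is answered from the index without scanning the documents; deciding which meaning of "bank" applies in each hit is then the task of the statistical and linguistic modules.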
  • 18. a) Dictionaries contain the individual words in their different forms (plural, singular, declined, conjugated, irregular verb forms, etc.). As a rule the dictionary will constitute a database application of its own. However, it is completely different in structure, makeup and content from the thesaurus discussed above.
b) Statistics modules analyse the occurrence and composition of words and combinations of words.
c) Linguistic and grammatical-analysis modules are the most difficult part. They must contain all the rules and comparative examples required to analyse syntax. Pattern recognition and fuzzy logic techniques are often used for this purpose.
d) The results of a), b) and c) above are combined, evaluated and interpreted in a comparison module. The comparison module is designed so that intermediate results of one module can be returned to another module for evaluation. This gives rise to an iterative process with a relatively high rate of recognition in texts on specific subjects for which there are electronic dictionaries containing the subject terminology.
e) Due to their architecture, traditional databases are not very effective at time-consuming text analysis. To speed things up, special cache and inverted file modules are often used as intermediaries.
f) Presentation modules handle the correct on-screen presentation of the translated text. They work with information from the dictionary module, the evaluated text from the database, and the inverted file system.
The running text interpretation system we have described can be used to evaluate queries in regular text. Fig. 15 shows the processing path for a query. The system goes through the modules from bottom to top in the same way to convert a text out of the database. The system shown here is just one possible configuration.
Since this technology is very new, many other approaches are being investigated. This particular approach has the advantage that different modules with differing evaluation strategies can be consulted simultaneously. Furthermore, each module can be dedicated to certain languages or vocabularies, and accessed automatically by the comparison module as needed. The interpretation and translation of a text is very time-consuming, and usually possible only on very fast interactive computers. Complex systems such as the one described should not be confused with simple translation aids.
Traditional full-text databases are seldom suitable for such systems. Standard database software uses a strategy of leaving out filler words, adjectives, adverbs, etc. in order to save memory space and increase database speed. However, a language interpretation system needs all of the information contained in the text, since otherwise coherent, context-adequate translation is not possible. "Language Interpreter" database systems have enjoyed initial successes with the United Nations and the European Commission.
The choice of a system for multilingual database applications is still simple at this point:
a) For document-oriented (facsimile) systems, applications with controlled vocabularies, and systems intended to bring about a standardization of use, the transformation, selection list and thesaurus approach is the right choice.
  • 19. b) For full-text applications which will not go into full use within the next three to four years, the approach described in this section should be attempted or at least examined.
At present there is no commercial software immediately available for either application, nor are off-the-shelf solutions likely to become available in the future, since the nature of the application and the vocabulary will be subject to constant change.
However, in my opinion an approach as shown in Fig. 3 is ideal. It combines the different transformation and interpretation components in one level where they work in parallel. They link the user interface with the database proper. This integrative approach combines the advantages of all of the techniques named, which can then be used individually or in combination as needed.
3. Sample Applications
We will now look at multilingual information and retrieval systems from the user's point of view, using three examples.
Application Examples
- HYPARCHIV: Standard optical filing software for Microsoft Windows in 9 languages
- wfBase: Distributed press and commercial information system in 4 languages based on ISDN nodes (wf, Switzerland)
- HEMIS: Meta-database and information system for environmental data; information, programmes, methods, etc. for CD-ROM distribution (UNEP/HEM, worldwide)
Fig. 16: Application examples
a) wfBase: Press and economics information in a distributed document-imaging system
b) HEMIS: Environmental information on CD-ROM
3.1 wfBase
  • 20. wfBase was developed specially for the Swiss Institute for Commercial Development (German "Wirtschaftsförderung", hence "wf"). It has been in operational use since 1992. The Swiss Institute for Commercial Development is located in Zürich, with offices in Geneva, Bern and Lugano. Prior to the introduction of wfBase, dossiers on political events, economic data, and the like were kept independently at all four locations. The goal of wfBase is to enable access by all Institute users to all press articles, periodicals, and Institute documents, independent of the language of data entry (Figs. 17 and 18).
wfBase
wf Schweitzer Wirtschaftsförderung, Swiss Institute for Commercial Development, Zürich - Geneva - Bern - Lugano
The wf owns one of the largest archives on commercial and political topics in Switzerland. It provides information to politicians, journalists and its commercial members, which represent all major companies of Switzerland.
- Optical filing system for press and commercial documents (scanned and created via word processor, spreadsheet, etc.)
- Distributed system linked via SwissNet 2 (ISDN)
- Access for wf employees and third-party partners via multilingual graphic user interface (ISDN and telephone modem)
- Database with four-language thesaurus
- Access to information independent of the language in which it was entered
- Several million documents stored on M/O jukeboxes (2 times 50 gigabytes)
- Integrated office communication with word processing, spreadsheet, fax, library management, electronic mail, accounting, address database, etc.
Fig. 17: wfBase - Features
  • 21. wfBase Storage and Communications Layout
[Figure: Zürich hosts the DB server, the archive server and the communications server, one jukebox for internal use and one for external use, and a hard-disk cache for images, files and descriptors (read / write / create). Lugano, Geneva and Bern each have a hard-disk cache for images, files and descriptors (read / create) and are connected via ISDN SwissNet 2. External users connect by telephone modem and ISDN. Further elements shown: wf users at each site, a hard-disk cache for external use, addresses, library, dossiers, Novell NetWare.]
Fig. 18: wfBase - System configuration with internal and external users and information management in two jukeboxes (Zürich)
  • 22. Besides document management, wfBase integrates other applications under its graphical user interface, such as word processing and spreadsheets, address and library management, billing for outside users, electronic faxing and mailboxes.

The wfBase system makes use of some HYPARCHIV modules, but is otherwise an independent application with a client-server architecture and a relational database on an OS/2 server. The MS Windows workplaces are linked in a Novell network. Outside users can access wfBase by modem, query documents ("subsets"), and display and print them locally or have wfBase fax the documents to them.

The four wfBase locations are linked by SwissNet 2 (ISDN). This network is fast enough to transmit compressed scanned facsimiles. Two jukeboxes store scanned facsimiles, locally generated data and incoming faxes. The system is highly error-tolerant and largely fail-safe.

At the heart of wfBase is the database with a quadrilingual (German, French, Italian, English) thesaurus for subject-area classification. The thesaurus includes over 2000 subject areas, organized hierarchically and with cross-links over four levels.

wfBase Multilingual Thesaurus
The two images show different views of the thesaurus for thematic keywords (here in German). The thesaurus supports the user with navigation, jump functions, short-key entries, synonym retrieval and other techniques for easy-to-use access.
[Screens I and II: thesaurus mask "Sachgebiet" (subject area), taken from the Online '92 presentation]
Fig. 19: wfBase multilingual thesaurus, showing two windows of the thesaurus screen. The left shows the branching from a broad term to a list of narrower terms. The thesaurus contains the subject areas covered in the dossiers.
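The language-independent access described for the wfBase thesaurus can be illustrated with a minimal sketch. It assumes that each subject area is stored once under a numeric identifier, that labels in the four languages point to that identifier, and that documents are indexed by identifier rather than by word. The class names, the sample subject areas and the sample documents are invented for illustration and are not taken from wfBase.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Concept:
    cid: int                       # language-neutral identifier
    labels: Dict[str, str]         # language code -> label
    parent: Optional[int] = None   # broader term, if any


class Thesaurus:
    def __init__(self, concepts: List[Concept]):
        self.label_index: Dict[str, int] = {}
        self.children: Dict[int, List[int]] = {}
        for c in concepts:
            # Simplification: a label is assumed to be unique across
            # languages; real thesauri must handle homonyms.
            for label in c.labels.values():
                self.label_index[label.casefold()] = c.cid
            if c.parent is not None:
                self.children.setdefault(c.parent, []).append(c.cid)

    def lookup(self, term: str) -> Optional[int]:
        """Map a term in any supported language to its concept ID."""
        return self.label_index.get(term.casefold())

    def subtree(self, cid: int) -> set:
        """Return the concept and all narrower concepts below it."""
        found, stack = {cid}, [cid]
        while stack:
            for child in self.children.get(stack.pop(), []):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found


def search(thesaurus: Thesaurus, documents: List[dict], term: str) -> List[str]:
    """Find documents by subject, whatever language the term is in."""
    cid = thesaurus.lookup(term)
    if cid is None:
        return []
    wanted = thesaurus.subtree(cid)
    return [d["title"] for d in documents if d["subject"] in wanted]


# Hypothetical sample data.
thesaurus = Thesaurus([
    Concept(1, {"de": "Wirtschaft", "fr": "Économie", "it": "Economia", "en": "Economy"}),
    Concept(2, {"de": "Außenhandel", "fr": "Commerce extérieur",
                "it": "Commercio estero", "en": "Foreign trade"}, parent=1),
    Concept(3, {"de": "Politik", "fr": "Politique", "it": "Politica", "en": "Politics"}),
])
documents = [
    {"title": "Article A", "lang": "de", "subject": 2},
    {"title": "Article B", "lang": "fr", "subject": 1},
    {"title": "Article C", "lang": "it", "subject": 3},
]

print(search(thesaurus, documents, "Economia"))
```

In this sketch a query in Italian retrieves the German and French articles because all three share the same identifiers, while the documents themselves remain in their original language.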
  • 23. In addition to the thesaurus, there are selection lists for other fields, and fields for text and data entry. The database enables the user to locate documents regardless of the language in which they were entered. However, the system displays documents only in their language of origin; in a multilingual country like Switzerland it is not necessary to translate the contents of documents, as users are expected to be multilingual as a matter of course. Instead, the objective of wfBase is to improve communication between office locations, standardize addresses and documentation, eliminate redundancies, and provide third parties (members of the wf's supporting organizations) with a simple, time-saving and cost-effective means of access.

3.2 HEMIS

Within the United Nations Environment Programme (UNEP) there is an organization called UNEP/HEM (Harmonization of Environmental Measurement), which is responsible for harmonizing environmental monitoring methods, plans, projects and information. Since 1990 a project has been underway at the Munich UNEP/HEM office to implement an information and meta-database system for UNEP/HEM, called HEMIS (= HEM Information System). HEMIS is intended to provide an overview of:
a) Current global and national environmental projects by the UN and other international and world organizations
b) Institutions, research emphases, periodicals, and key personnel
c) Methodology, reference materials, etc.
d) Databases, data formats, data quality, access, etc.
The information contained in HEMIS is meta-data compiled from widely varying sources (Figs. 20 and 24).

HEMIS
UNITED NATIONS ENVIRONMENT PROGRAMME
HARMONIZATION OF ENVIRONMENTAL MEASUREMENT
UNEP / HEM, Nairobi / Munich
The UNEP / HEM Office harmonizes nomenclature, measurements and other information used worldwide in environmental projects. This task will be supported in the future by the HEMIS meta-database and information system, a multilingual, CD-ROM-based PC system.
- Multilingual thesauri for scientific nomenclature, countries, climates, etc. with references, links, synonyms, homonyms, acronyms and wildcard functionality
- Harmonization of nomenclature by standardized access to information
- Hyperlinks, guided tours and global search facilities together with the thesaurus enable easy access to the information independent of the language of entry
- CD-ROM-based worldwide distribution
Fig. 20: HEMIS - Information and meta-database system of the UN environmental organization UNEP/HEM
  • 24. The goal is to harmonize access to heterogeneous information of varying quality and extent from varying sources. HEMIS is made up of two component systems:
a) One system, to be installed in Munich, will be used to collect, process and manage all information and to make its contents readily accessible. It is intended to create reports (printouts) selected from its database and to create CD-ROM databases.
b) The other will handle worldwide distribution of extracts from HEMIS in Munich by CD-ROM in regularly updated editions.
The two component systems will have differing user interfaces, databases, etc.¹ System a) is a production system that will generally be used only by UNEP/HEM employees. System b) is designed to provide information internationally on environmental projects, prevent parallel developments, and supply basic project and database data, even if the information is not available in the user's own language. The HEMIS CD-ROM will be made as attractive as possible so that it is widely used, and so that other institutions not associated with the UN will be motivated to supply data for the system (Fig. 21).

[Figure: Harmonization and distribution of information via HEMIS. Examples of sectoral / regional / specialized sources of environmental meta-data (INFOTERRA, ESA, EEA-TF, WMO, GEMS, IAEA, governments, NGOs, UN, UNEP, EARTHWATCH, others) feed the HEMIS high-level data model (methods / models, classification systems, databases, programmes, institutions, persons), which serves the users.]
Fig. 21: Information harmonization and distribution by HEMIS. Data on paper, diskette and CD is read into the stationary HEMIS, selected and formatted, classified semi-automatically or manually following a defined nomenclature (thesauri), and finally distributed in the form of printed reports on specific subjects or on CD-ROM. This figure shows only a representative sample of the participating organizations.

¹ At this writing (late 1992) HEMIS is still at the design and prototype stage. Not all components have been implemented as yet.
  • 25. The major components of both the stationary and the CD-ROM HEMIS systems are a number of electronic thesauri, structured as shown in Figs. 11, 12, and 22.

The Internal Structure of the HEMIS Thesaurus
- A: Unique identifier (ID). Numeric; one entry; 8 digits. Unique reference key for the descriptor database.
- B: IDs of predecessors (ISO TT, BT, links). Numeric; up to 64 entries; 8 digits each; sequence of digits. Internal management, bi-directional.
- C: IDs of followers (ISO NT, links). Numeric; up to 255 entries; 8 digits each; sequence of digits. Internal management, uni-directional.
- D: Main descriptor for display in the hierarchy of the thesaurus. Alphanumeric; one entry; up to 20 characters (due to display restrictions). Retrievable via hierarchical selection list and global search.
- E: Position of "D" in the hierarchy. Numeric; one entry; up to 8 digits (max. of 8 hierarchy levels). For screen display in the hierarchical thesaurus only.
- F: Synonyms, acronyms, homonyms, interpretations, etc. of "D". Numeric; up to 255 entries; up to 40 characters each; sequence of texts. Retrievable via global search.
- G: Explanation. Alphanumeric; one entry; up to 255 characters. Available as context-sensitive help function.
Fig. 22: Structure of the HEMIS thesaurus for geographical units, climate zones, subject areas, and other hierarchically structured reference keys. For an explanation of the entries in the first row see Section 2.5 and Fig. 12.

The thesauri and selection lists are part of both HEMIS systems. In the stationary system they are used to assign keywords to data sets, documents, graphics, images, etc., and to search and compile data. If information is supplied on computer media in pre-agreed formats, part of the keyword assignment can be done by the system automatically. In the CD-ROM version the thesauri, selection lists and all other entries are used only for researching and compiling information. The HEMIS CD-ROM version has a multi-layer modular structure (see Fig. 23).
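The field layout of Fig. 22 could be expressed as a record type with the stated size limits checked on entry. The following is a minimal sketch and not the HEMIS implementation: the class and attribute names are invented, "8 digits" is assumed to mean a non-negative integer of at most 8 digits, and field E is stored as a plain number without interpreting how its digits encode the hierarchy.

```python
from dataclasses import dataclass, field
from typing import List

MAX_ID = 99_999_999  # assumption: "8 digits" means at most 8 decimal digits


def _is_id(value: int) -> bool:
    return isinstance(value, int) and 0 <= value <= MAX_ID


@dataclass
class ThesaurusEntry:
    ident: int                                              # A: unique identifier
    descriptor: str                                         # D: main descriptor
    position: int                                           # E: position in the hierarchy
    predecessors: List[int] = field(default_factory=list)   # B: broader terms, links
    followers: List[int] = field(default_factory=list)      # C: narrower terms, links
    synonyms: List[str] = field(default_factory=list)       # F: synonyms, acronyms, etc.
    explanation: str = ""                                   # G: help text

    def problems(self) -> List[str]:
        """Return the Fig. 22 limits that this entry violates."""
        found = []
        if not _is_id(self.ident):
            found.append("A: ID must fit in 8 digits")
        if len(self.predecessors) > 64 or not all(map(_is_id, self.predecessors)):
            found.append("B: at most 64 predecessor IDs of 8 digits")
        if len(self.followers) > 255 or not all(map(_is_id, self.followers)):
            found.append("C: at most 255 follower IDs of 8 digits")
        if not 1 <= len(self.descriptor) <= 20:
            found.append("D: descriptor must have 1 to 20 characters")
        if not 0 <= self.position <= MAX_ID:
            found.append("E: position must fit in 8 digits")
        if len(self.synonyms) > 255 or any(len(s) > 40 for s in self.synonyms):
            found.append("F: at most 255 synonyms of 40 characters")
        if len(self.explanation) > 255:
            found.append("G: explanation must not exceed 255 characters")
        return found


# Hypothetical entries for illustration.
europe = ThesaurusEntry(ident=10000000, descriptor="Europe", position=1,
                        followers=[10010000], synonyms=["EUR"],
                        explanation="Geographical unit")
too_long = ThesaurusEntry(ident=1, descriptor="x" * 21, position=1)

print(europe.problems())
print(too_long.problems())
```

Checking the limits at entry time keeps descriptors within the display restriction mentioned in the figure, so that the CD-ROM version does not have to truncate them later.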
  • 26. [Figure: HEMIS system layout with multilingual user interfaces. Components shown: user interface (e.g. English) and additional user interfaces in other languages; language translator; query by example, global search, thesauri, selection lists, guided tours and links; descriptor database (field-oriented) holding standard variables (alphanumeric, numeric, etc.) and numeric keys related to thesauri and selection lists; database of guided-tour links; hyperlinks (part of the stored objects); Information Retrieval and Access System (IRAS); objects (texts, images, datasets).]
Fig. 23: HEMIS system layout with multilingual user guidance and search. The user interfaces in the various languages make up the first layer. The next layer is composed of modules for different search and navigation strategies, likewise language-specific. In addition to a database, HEMIS has prearranged "guided tours" and "links".

The information and documents on the CD-ROM are managed by an Information Retrieval and Access System (IRAS). In addition to searching for certain keywords or terms, HEMIS also offers navigation assistance in the form of prearranged "guided tours" and individual links. A global database search takes longer than a search in a single field, but it lets the user work with the system without knowing in advance what contents lie behind a given field in the search mask.

The user interface can be switched among the loadable languages, as can the thesauri, selection lists, links and guided tours. Free text input and scanned-in documents are not translated. HEMIS is intended to provide the initial information; the user can then consult the source institutions, databases, or publications for more in-depth information.

Fig. 24 shows the proposed starting screen of the HEMIS prototype with the button fields for moving to the main subject-area screens.
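The interplay of numeric keys, loadable language tables and the two search modes can be sketched roughly as follows. This is an illustration under assumptions, not the HEMIS design: the label tables, records and function names are invented, and the global search is a plain scan over all displayed field values.

```python
from typing import Dict, List

# Hypothetical loadable label tables: numeric key -> label per language.
LABELS: Dict[str, Dict[int, str]] = {
    "en": {101: "Water", 102: "Air"},
    "de": {101: "Wasser", 102: "Luft"},
    "fr": {101: "Eau", 102: "Air"},
}

# Hypothetical descriptor records. The subject is stored as a numeric key;
# free text such as the title is stored once and is not translated.
RECORDS: List[dict] = [
    {"title": "Rhine monitoring programme", "subject": 101,
     "institution": "Example Institute A"},
    {"title": "Urban air quality database", "subject": 102,
     "institution": "Example Institute B"},
]


def render(record: dict, lang: str) -> dict:
    """Return a display copy with the subject key shown in the chosen language."""
    shown = dict(record)
    shown["subject"] = LABELS[lang][record["subject"]]
    return shown


def query_by_example(records: List[dict], **example) -> List[dict]:
    """Field-wise search: the user must know which field holds the value."""
    return [r for r in records
            if all(r.get(name) == value for name, value in example.items())]


def global_search(records: List[dict], term: str, lang: str) -> List[dict]:
    """Scan every displayed field: slower, but needs no knowledge of the fields."""
    needle = term.casefold()
    return [r for r in records
            if any(needle in str(value).casefold()
                   for value in render(r, lang).values())]


print(render(RECORDS[0], "de"))
print(global_search(RECORDS, "luft", "de"))
```

Because only the label tables differ between languages, switching the interface language in this sketch changes what is displayed and searched as a subject label, while titles and other free text stay in their language of entry.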
  • 27. [Figure: Start screen of the HEMIS prototype, titled "HEMIS Environmental Information System". Buttons: Institutions, Programmes, Ref. Material, Methods, Databases, Region, Location, Subject Thesaurus, Guided Tours, Help, EXIT. Language selection: Français ("Choisir"), English ("Choose"), Deutsch ("Wählen Sie").]
Fig. 24: HEMIS starting screen (suggested CD-ROM version)

4. Outlook and Summary

The development of multilingual information and retrieval systems has only just begun.

Conclusions
Multilingual Information and Retrieval Software
The European Challenge for 1993
- Multilingual software is a must for all companies and organizations working in different European countries
- The American software industry is presently unable to supply multilingual software - this is a window of opportunity for European software companies
- Multilingual software helps to bridge the national barriers within Europe
- Multilingual software is intelligent object-oriented programming using databases and information management systems as a framework for huge masses of coded and non-coded information
Fig. 25: Summary of the most important arguments for multilingual software
  • 28. In this article, the following arguments have been advanced (Fig. 25):
a) Multilingual software is a necessity for all organizations with Europe-wide or world-wide activities, for which a single "company language" is undesirable or impracticable.
b) Multilingual software is available in its basic features as standard software, but as a rule it must be modified for the specific application before it can be used to full benefit (compare wfBase, 3.2, and HEMIS, 3.3).
c) Multilingual retrieval software can be used for accessing large quantities of data or documents on digital optical storage media.
d) Multilingual thesauri encourage standardization in document classification, enable clear and structured access to documents, and support searches for documents not in the user's own language.
e) Multilingual fulltext retrieval and translating systems are in use in prototype form. Combined with other techniques, such as thesauri, they will make easy-to-use information systems feasible in the future.
f) Multilingual software is a market opportunity for European software and systems firms.
g) Multilingual retrieval and information systems can be used to advantage in almost all areas of business and administration which extend beyond national and cultural boundaries.