Presents DITA markup for representing glossaries (<glossentry> and related elements) and references to them (<term>). Discusses strategies for how to assign and manage keys associated with glossary entries. Also discusses some of the processing challenges inherent in the glossary feature design.
Can I Have a Word: Managing Shared Glossaries and References to Terms With DITA
1. Can I Have a Word:
Managing Shared Glossaries and
References to Terms With DITA
Eliot Kimber
Contrext
Tekom 2017
2. About the Author
• Independent consultant focusing on DITA
analysis, design, and implementation
• Doing SGML and XML for cough 30 years cough
• Founding member of the DITA Technical
Committee
• Founding member of the XML Working Group
• Co-editor of HyTime standard (ISO/IEC 10744)
• Primary developer and founder of the DITA for
Publishers project
• Author of DITA for Practitioners, Vol 1 (XML Press)
3. Agenda
• DITA glossary markup
• Glossary challenges
• Managing and using glossary entries
• Glossary processing
5. Glossary is…
• Terms and their definitions
• For presentation to readers
• May include definitions of acronyms and
abbreviations
• May include lexicographic details: part of
speech, etc.
• Source for use-by-reference of <term>
elements in content
6. Glossary is not…
• Formal term list as used in terminology
management tools like Congree or Acrolinx
– Terminology management is a separate concern
from glossary authoring and presentation
7. General Requirements
• Provide glossary of terms in publications
• Get terms by reference in content (mentions of
terms)
• Links from uses of terms to their glossary entries
• Show expansions of acronyms and abbreviations
on first use
• Reuse glossary entries in multiple publications
• Publish master glossary with links to it from other
publications
9. <glossentry>
• Topic type for glossary entries
• Captures:
– Term
– Definition
– Abbreviated forms
– Parts of speech
– Surface form
– Other details
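A minimal sketch of a glossary entry topic that captures the items listed above. The ID, term, and definition text are illustrative assumptions, not taken from the talk; the element names are the standard DITA glossary elements.

```xml
<glossentry id="gloss-usb" xml:lang="en">
  <!-- Term -->
  <glossterm>Universal Serial Bus</glossterm>
  <!-- Definition -->
  <glossdef>A standard interface for connecting peripheral devices to a computer.</glossdef>
  <glossBody>
    <!-- Part of speech -->
    <glossPartOfSpeech value="noun"/>
    <!-- Surface form: the full form shown where the term is introduced -->
    <glossSurfaceForm>Universal Serial Bus (USB)</glossSurfaceForm>
    <!-- Abbreviated form -->
    <glossAlt>
      <glossAcronym>USB</glossAcronym>
    </glossAlt>
  </glossBody>
</glossentry>
```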
10. <glossgroup>
• Topic type for grouping glossary entries
together into one source document
• Allows nested <glossentry> elements
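A short sketch of a <glossgroup> holding two nested entries in one source document. IDs, terms, and definitions are made up for illustration.

```xml
<glossgroup id="gloss-group-a">
  <title>A</title>
  <glossentry id="gloss-apple">
    <glossterm>apple</glossterm>
    <glossdef>The edible fruit of the apple tree.</glossdef>
  </glossentry>
  <glossentry id="gloss-apricot">
    <glossterm>apricot</glossterm>
    <glossdef>A small orange stone fruit.</glossdef>
  </glossentry>
</glossgroup>
```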
11. <glossref>
• Topicref type for referring to glossary topics
• DO NOT USE
• Sets @toc to “no”
• Sets @print to “no”
– Nobody knows why
• Requires @keys attribute
12. <abbreviated-form>
• Reference to a glossary entry
– Specialization of <term>
• Intended to produce abbreviation and
expansion on “first use”
• Produces just abbreviation on other
occurrences
• Challenge: When is a use the “first use”?
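A sketch of <abbreviated-form> in running text. It assumes a hypothetical key gloss-usb bound to a glossary entry that supplies both a surface form ("Universal Serial Bus (USB)") and an acronym ("USB").

```xml
<p>Connect the device to a free <abbreviated-form keyref="gloss-usb"/> port.
The <abbreviated-form keyref="gloss-usb"/> cable is included.</p>
```

If the processor treats the first reference as the first use, the intent is that it renders as "Universal Serial Bus (USB)" and the second as "USB". Whether it does depends on how the processor decides what counts as first use.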
13. <term>
• Can use @keyref to use a glossary term by
reference
• Renders the glossary term if the <term> element has no content of its own
• Should be a link to the glossary entry
• Example:
<p>The <term keyref="gloss-framitz"/>
…</p>
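For the key reference above to resolve, the key has to be bound to the glossary entry in a map. A sketch, where the href is a hypothetical file name:

```xml
<!-- In the publication map: binds the key used by <term keyref="gloss-framitz"/> -->
<topicref keys="gloss-framitz" href="glossary/framitz-gloss.dita"/>
```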
14. <sort-as>
• Can be used in topic prolog to provide sorting
key
– Often required for Japanese
– May be required for Simplified Chinese
– Other languages, terms with special characters,
etc.
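A sketch of <sort-as> in the prolog of a Japanese glossary entry, where the kanji term is sorted by its kana reading. The entry itself is an illustrative assumption.

```xml
<glossentry id="gloss-kanji" xml:lang="ja">
  <glossterm>漢字</glossterm>
  <glossdef>中国に起源を持つ表語文字。</glossdef>
  <prolog>
    <!-- Sort key: the kana reading of the term -->
    <sort-as>かんじ</sort-as>
  </prolog>
</glossentry>
```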
16. Glossary Entries as Resources
• Manage glossentry topics as individual docs
– Typical DITA practice for topics in general
• Must have associated keys
• Challenges:
– Where to define the keys?
– Defining naming conventions for keys
17. Maps for Glossaries
• Glossary entries MUST be part of the
publication navigation tree
• <keydef> is either not appropriate or not
sufficient
– <keydef> has processing role of “resource-only”
– Does not put referenced topic in the navigation
tree
• Need normal-role topicrefs to glossary entries
18. Grouping Entries
• Obvious approach is to use topicheads to group entries:
  <topichead>
    <topicmeta>
      <navtitle>Glossary</navtitle>
    </topicmeta>
    <topichead>
      <topicmeta>
        <navtitle>A</navtitle>
      </topicmeta>
      <topicref keys="gloss-apple"
                href="glossary/apple-gloss.dita"/>
      …
    </topichead>
    …
  </topichead>
• Doesn’t always work the way you might expect
19. Topichead Chunking Rule
• @chunk="to-content" on <topichead>
makes the topichead act like a reference to a title-only topic
– DITA Spec: Clause 2.4.5.1 “Using the @chunk
attribute”
• Unfortunately, includes all child topics in the
resulting chunk
– Probably not what you want for glossaries
– Have to specify @chunk on each subordinate topicref
– Very annoying
• Bugs in Open Toolkit as of 2.5.4 produce incorrect
results in both HTML and PDF
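A sketch of what this looks like in a map. The value used on the child topicrefs (to-content) and the second entry are assumptions; the slide does not say which value is needed, and actual results depend on the processor.

```xml
<topichead chunk="to-content">
  <topicmeta>
    <navtitle>Glossary</navtitle>
  </topicmeta>
  <!-- Without @chunk on each child, the entries are pulled into the chunk generated for the topichead -->
  <topicref keys="gloss-apple" href="glossary/apple-gloss.dita" chunk="to-content"/>
  <topicref keys="gloss-banana" href="glossary/banana-gloss.dita" chunk="to-content"/>
</topichead>
```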
20. Workaround for Grouping
• Create title-only topics for what would
otherwise be topicheads
– Glossary top-level topic
– Each group
• Will need these for each language-specific
group for localized glossaries
• Easy enough to generate
– Could do as extension to Open Toolkit
preprocessing
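A sketch of the workaround: a title-only topic stands in for each topichead, and the map refers to it with an ordinary topicref. File names and IDs are hypothetical.

```xml
<!-- glossary/group-a.dita: title-only topic for the "A" group -->
<topic id="glossary-group-a">
  <title>A</title>
</topic>
```

```xml
<!-- In the publication map: title-only topics replace the topicheads -->
<topicref href="glossary/glossary-title.dita">
  <topicref href="glossary/group-a.dita">
    <topicref keys="gloss-apple" href="glossary/apple-gloss.dita"/>
  </topicref>
</topicref>
```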
21. Challenge:
How to Define Glossaries in Maps?
• Two basic options:
1. Use normal-role topicrefs only
2. Use both resource-only topicrefs and
normal-role topicrefs that refer to the
resource-only topicrefs by key
• Depends on your reuse requirements
22. Map Organization Option 1:
Just Normal-Role Topicrefs
• Publication map has normal topicrefs to the
glossary entries
• Can have a single reusable submap
• Or can author separately for each publication
• Advantage: Keeps it simple
• Disadvantage: May have redundant or
duplicate authoring in different publications
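A sketch of option 1 using a single reusable submap. The map title and file names are hypothetical.

```xml
<map>
  <title>Product A User Guide</title>
  <topicref href="topics/overview.dita"/>
  <!-- Shared submap holding normal-role topicrefs, with keys, to the glossary entries -->
  <mapref href="glossary/common-glossary.ditamap"/>
</map>
```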
23. Map Organization Option 2:
Keydefs + Normal Topicrefs
• Have a master map that uses <keydef> to refer to glossary entry topics
– These <keydef> keys are NOT to be used as target of <term> and
<abbreviated-form> elements
– Reflects “exactly one topicref with URI reference to a given topic”
policy
• In each publication:
– Grouping topicrefs
– Normal-role topicrefs with keys and keyref to <keydef> keys
• Advantage: Makes reuse easier to manage
• Disadvantages:
– Two keys where there was one
– May still have per-publication navigation structures for glossaries
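A sketch of option 2. The key-naming convention (a res- prefix for the resource keys) and all file names are assumptions made for illustration.

```xml
<!-- master-glossary-keys.ditamap: the one place with a URI reference to each entry -->
<map>
  <title>Glossary entry resources</title>
  <keydef keys="res-gloss-apple" href="glossary/apple-gloss.dita"/>
</map>
```

```xml
<!-- Publication map: normal-role topicrefs that point to the keydefs by key -->
<map>
  <title>Product A User Guide</title>
  <mapref href="master-glossary-keys.ditamap"/>
  <topicref href="glossary/glossary-title.dita">
    <!-- gloss-apple is the key that <term> and <abbreviated-form> refer to -->
    <topicref keys="gloss-apple" keyref="res-gloss-apple"/>
  </topicref>
</map>
```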
24. Master Glossaries
• Separate publication that is just the glossary
• Cross-deliverable links from other publications to
glossary entries
• Cross-deliverable links are always a challenge
• DITA 1.3 provides cross-deliverable linking feature
– Probably not implemented in your tools as of
November 2017
• Can use deliverable-specific topicrefs
– Requires that you know how glossary entries will be
delivered
– Would expect to generate them automatically
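A sketch of how the DITA 1.3 cross-deliverable feature might be used here, assuming the tools support it. The master glossary's root map is referenced as a peer with a key scope; the path, scope name, and key are hypothetical.

```xml
<!-- In the publication map: peer reference to the master glossary's root map -->
<mapref href="../master-glossary/master-glossary.ditamap"
        scope="peer"
        keyscope="glossary"
        processing-role="resource-only"/>

<!-- In a topic: scope-qualified key reference to an entry in the master glossary -->
<p>Peel the <term keyref="glossary.gloss-apple"/> first.</p>
```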
26. Processing Challenges
• Determining “first use” for abbreviated form
references
• Automatic grouping and sorting
• Producing minimum glossary for a given
publication
27. First Use Problem
• What is the scope?
– Single topic?
– “Chapter”?
– Entire publication?
• Scope may be different for different
deliverable types
• May have different editorial rules
• Difficult to have a general solution
28. Automated Grouping and
Sorting
• Nothing in standard-defined map markup that
says unambiguously “this branch of the map is a
glossary”
• Need locale-specific configuration for grouping
• Need locale-specific configuration for sorting
• Simplified Chinese needs special support
– DITA Community i18n project provides necessary
features
– Somebody needs to implement Open Toolkit plugin
for doing glossary sorting
29. Generating Glossary Based on
Terms Used
• Possible to generate a glossary that reflects
only those terms actually used in the topics
included in a publication
• Requires synthesizing normal-role topicrefs so
key references will work properly
• Could be implemented as an extension to
Open Toolkit preprocessing
• Could be a separate process that generates
otherwise-normal map and topic components