2. XSL
XSL is the eXtensible Stylesheet Language for XML documents. XSLT stands for XSL Transformations.
Are you familiar with CSS (or cascading stylesheets) for HTML?
XSL helps to display XML documents and includes:
• XSLT – language for transforming XML documents
• XPath – language for navigating XML documents
• XSL-FO – language for formatting XML documents
In this presentation, we will be looking at XSLT 2.0 and XPath to transform an XML document into another
XML document.
3. XPath
XPath is the language used to navigate XML documents. XPath is
essential to know for XSL.
XPath navigates “nodes”, of which there are 7 kinds:
◦ Elements
◦ Attributes
◦ Text
◦ Namespace
◦ Processing instruction (<?name value?>)
◦ Comment (<!-- -->)
◦ Document nodes
4. XPath and Nodes
The XML document is a tree of nodes.
In this example…
<bookstore> is the root element node
<author> is an element node
lang=“en” is an attribute node
“Harry Potter” and “en” are atomic values; atomic values and nodes are both called items.
<book> is a child of <bookstore>
<title> is a sibling of <author>
What is <year>?
<book> and <bookstore> are the ancestors of <title>
What are the descendants of <bookstore>?
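The example itself is not reproduced in this text. As a minimal sketch, a document consistent with the relationships listed here might look like the one below. Only “Harry Potter” and lang=“en” come from the slide; the placement of lang on <title>, the <price> element, and all other values are assumptions for illustration. The trailing comment lists XPath expressions that address each relationship.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sample; only "Harry Potter" and lang="en" come from the slide. -->
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>
<!--
  XPath expressions evaluated against this document:
  /bookstore                          the root element node
  /bookstore/book                     the child of <bookstore>
  /bookstore/book/title/@lang         the attribute node lang="en"
  //title/following-sibling::author   <author>, a sibling of <title>
  //title/ancestor::*                 <book> and <bookstore>
  /bookstore//*                       every descendant element of <bookstore>
-->
```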
9. Start with an XML document
We’re working with a very small MODS export from Open Refine
Open-Refined-Farrel-xslx.txt
Before anything else, rename this file with the extension “.xml”. Then open it in your XML editor
and remove every “null”. You might also have to escape any bare & as &amp; and remove any HTML that you forgot to strip
in Open Refine, such as <br>. You can use the Find and Replace function in your text editor.
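As a rough illustration of this cleanup, here is a hypothetical fragment. The element names follow the source file, but the values and the raw form shown in the comment are invented for the example and are not taken from the actual export.

```xml
<!-- Raw export (hypothetical, not well-formed):
       <Title>Letters & Papers<br></Title>
       <Topic>null</Topic>
     After cleanup: -->
<record>
  <Title>Letters &amp; Papers</Title>
  <Topic></Topic>
</record>
```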
10. XML Document
What is the structure of this XML document? Can you find the root element? What are the
siblings, ancestors, descendants, etc.?
Is the XML document associated with a metadata standard? What is that standard and its
requirements?
11. What do you want to do?
Now that you understand the structure of your XML document, what do you want to do with it?
It is necessary to construct a sort of story about your XML document.
In the CTDA, we want to create individual MODS records that follow the CTDA MODS
implementation guidelines from the source XML document.
This is only one story. You can do other things with XSLT such as:
• I want to display my XML document in any browser.
• I want to count how many times the word “the” is used.
• Etc.
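As a sketch of the word-counting story, the following XSLT 2.0 stylesheet prints how many times “the” occurs in the text of any XML document. It is one possible approach, not the only one; it assumes a word is any run of characters between non-word characters and ignores case.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Match the document node and report a single number. -->
  <xsl:template match="/">
    <!-- For every text node: lower-case it, split it on runs of
         non-word characters, and keep only the tokens equal to "the". -->
    <xsl:value-of select="count(for $t in //text()
                                return tokenize(lower-case($t), '\W+')[. = 'the'])"/>
  </xsl:template>
</xsl:stylesheet>
```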
12. We want MODS XML documents
We have a source XML document. In our example, how many result MODS XML documents do
we need?
Do the source XML document and the resulting MODS XML documents have the same structure?
What’s different?
13. What is our story?
Our tale begins with a small XML document. This XML file is not
associated with any namespace, and hence uses no prefixes. It is not
associated with any metadata standard. This XML file has only 2
records.
The XML file has the following outline:
Root: metadata
Child of metadata: record
Children of record (grandchildren of metadata): Title, Creator,
Place, Topic, TopicPerson, Genre, Contributors, Datecreated,
FileName, Source, Collection, Rights
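Written out as XML, this outline corresponds to a skeleton like the one below. The elements are left empty because the actual values are not given here, and the order of the children is assumed to follow the list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <record>
    <Title/>
    <Creator/>
    <Place/>
    <Topic/>
    <TopicPerson/>
    <Genre/>
    <Contributors/>
    <Datecreated/>
    <FileName/>
    <Source/>
    <Collection/>
    <Rights/>
  </record>
  <record>
    <!-- second record: the same twelve children -->
  </record>
</metadata>
```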
14. Story Continued
We want to create a folder (preferably) that
contains the individual MODS records.
Each MODS record should preferably be named
after the image file, which is a condition of
the batch import. If that is not possible, you can
generate an id with XSL.
Each MODS record must be well-formed and
valid according to the MODS schema and the
CTDA MODS Guidelines.
16. XSLT
An XSLT stylesheet must begin with the correct declaration.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
• An XSLT stylesheet consists of one or more sets of rules, called templates.
• XSLT has a number of elements that help manipulate the values of selected nodes or elements in
your source XML document.
<xsl:value-of> is used to extract the value of an XML element
<xsl:for-each> is used to process every node in a selected set, one at a time
<xsl:choose> is used to express multiple conditional tests (with <xsl:when> and <xsl:otherwise>)
<xsl:result-document> is used to write output to a separate file (new in XSLT 2.0).
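The four elements can be combined in one stylesheet. The following is a minimal sketch, assuming the source outline of metadata/record with Title, Rights and FileName children. The mapping to MODS elements, the “mods” output folder, and the use of generate-id() as the fallback name are assumptions for illustration; they are not the CTDA MODS guidelines, and a real transform would map every field.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:mods="http://www.loc.gov/mods/v3">

  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>

  <xsl:template match="/metadata">
    <!-- xsl:for-each visits every <record> in turn. -->
    <xsl:for-each select="record">

      <!-- xsl:choose picks the output name: FileName if it has a value,
           otherwise an id generated for this record. -->
      <xsl:variable name="name">
        <xsl:choose>
          <xsl:when test="normalize-space(FileName) != ''">
            <xsl:value-of select="normalize-space(FileName)"/>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="generate-id()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:variable>

      <!-- xsl:result-document writes each record to its own file;
           the "mods" folder name is an assumption. -->
      <xsl:result-document href="mods/{$name}.xml">
        <mods:mods>
          <mods:titleInfo>
            <!-- xsl:value-of copies the text of the source element. -->
            <mods:title><xsl:value-of select="Title"/></mods:title>
          </mods:titleInfo>
          <mods:accessCondition><xsl:value-of select="Rights"/></mods:accessCondition>
        </mods:mods>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Run with an XSLT 2.0 processor, this should produce one file per record. If FileName already carries an image extension, the output name keeps it (for example, a value ending in .jpg yields a name ending in .jpg.xml), so you may want to strip the extension first.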