Using Schematron for Analyzing Conformance to Best Practices for EAD, TEI, and MODS
(and some other thoughts on workflow tools)
Jenn Riley
Metadata Librarian
Indiana University Digital Library Program
Consistency is a challenge
• Document-centric XML (TEI, EAD) is very difficult to create consistently
• Some common tools to help:
  ◦ Schema/DTD validation
  ◦ Tag libraries
  ◦ XML templates
  ◦ Example documents
  ◦ Keyboard macros
  ◦ Detailed encoding guidelines
• These are not enough!
7/11/09

ALA 2009 - ALCTS NRMIG

2
Another possible tool layer
• Machine validation of a file against local encoding guidelines
• Can only go so far, but that far is extremely helpful
• Indiana University implemented using:
  ◦ Schematron (http://www.schematron.com)
  ◦ <oXygen /> plugin architecture
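
To make "machine validation against local encoding guidelines" concrete, here is a minimal sketch. It is written in plain Python rather than Schematron, so it illustrates the idea only and is not the IU implementation. The rule (every `<unitdate>` must carry a `normal` attribute), the sample EAD fragment, and the function name `check_unitdates` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical local guideline (not taken from the IU guidelines):
# every <unitdate> element must carry a machine-readable "normal" attribute.
SAMPLE_EAD = """\
<ead>
  <archdesc level="collection">
    <did>
      <unittitle>Example family papers</unittitle>
      <unitdate normal="1900/1950">1900-1950</unitdate>
      <unitdate>circa 1920</unitdate>
    </did>
  </archdesc>
</ead>
"""


def check_unitdates(xml_text):
    """Return one message per <unitdate> that lacks a 'normal' attribute."""
    root = ET.fromstring(xml_text)
    problems = []
    for unitdate in root.iter("unitdate"):
        if "normal" not in unitdate.attrib:
            text = (unitdate.text or "").strip()
            problems.append(f"unitdate '{text}' has no normal attribute")
    return problems


if __name__ == "__main__":
    for message in check_unitdates(SAMPLE_EAD):
        print(message)
```

Both dates in the sample are well-formed XML that a permissive schema could accept. Only the local rule flags the second one, which is the gap this tool layer is meant to fill.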

Inspiration: RLG EAD Report Card

More info on Schematron
• ISO/IEC 19757 - Document Schema Definition Languages (DSDL) - Part 3: Rule-based validation – Schematron.
• Be careful! http://www.schematron.com has the primary specs; http://schematron.com is for a particular company’s tool using them. (Weird.)
• This is the page you want:


Using a Schematron file
• Schematron home page provides two distributions:
  ◦ One for XSLT 1.0 processors and one for 2.0 processors
  ◦ Each includes a set of three stylesheets to be used in turn on the Schematron file
  ◦ Result of this processing is a stylesheet to be run on your XML instance document
• IU implementation wraps this all up into an <oXygen /> plugin written in Java
• You could also pipe them together with a shell script, a Windows .bat file, etc.
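
The "pipe them together" option can be sketched as a small Python script that chains the three stylesheets and then runs the result on an instance document. This is an illustration under assumptions, not the IU plugin. The slide does not name the stylesheets, so the file names below follow the ISO Schematron skeleton for XSLT 1.0 and may differ in your distribution. `xsltproc` is assumed as the XSLT 1.0 processor, and `guidelines.sch`, `finding_aid.xml`, and the intermediate file names are hypothetical.

```python
import subprocess

# Assumed file names: the slide does not name the three stylesheets. These
# follow the ISO Schematron skeleton for XSLT 1.0; adjust to your distribution.
STAGES = ["iso_dsdl_include.xsl", "iso_abstract_expand.xsl", "iso_svrl_for_xslt1.xsl"]


def build_commands(schema, instance, report="report.xml"):
    """Return the xsltproc command lines for the full pipeline, in order."""
    commands = []
    current = schema
    # Run each stage on the output of the previous one.
    for number, stylesheet in enumerate(STAGES, start=1):
        is_last = number == len(STAGES)
        output = "compiled.xsl" if is_last else f"stage{number}.sch"
        commands.append(["xsltproc", "-o", output, stylesheet, current])
        current = output
    # The last stage yields a stylesheet; run it on the XML instance document.
    commands.append(["xsltproc", "-o", report, current, instance])
    return commands


def run_pipeline(schema, instance):
    """Execute the pipeline; requires xsltproc and the stylesheets on disk."""
    for command in build_commands(schema, instance):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands rather than running them.
    for command in build_commands("guidelines.sch", "finding_aid.xml"):
        print(" ".join(command))
```

Keeping command construction separate from execution makes it easy to inspect the four steps before anything runs, and to reuse the compiled stylesheet across many instance documents.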


Let’s step back
• How can better tools revolutionize metadata creation workflows?
  ◦ Promoting consistency
    ▪ This is hard and not something that humans are generally good at
  ◦ True interoperability between systems
    ▪ Without futzing!
• We spend too much valuable human time doing repetitive and low-value tasks as part of descriptive workflows

Where do we go from here?
• Make better use of available technologies
  ◦ Automating
  ◦ Streamlining
  ◦ Validating
• We can and must do our jobs better and more efficiently, with the help of better tools
  ◦ Providing comparable services with less
  ◦ Creating a convincing argument for more?

There is no excuse for not having usable metadata creation tools.
• Smart systems are possible and necessary
  ◦ Configurable
  ◦ Modular
  ◦ Connected
• Make it easy to do it well
  ◦ Consistent
  ◦ Complete
  ◦ Efficient
• Make it hard to do it poorly
• We must pay attention to user interface design for cataloging tools

OK, rant over. Thank you!
• jenlrile@indiana.edu
  ◦ (watch out for the invisible “l” in the middle)
• Slides and handout:
  ◦ On ALA presentations Wiki <http://presentations.ala.org>
  ◦ On my home page <http://www.dlib.indiana.edu/~jenlrile/presentations/nrmig2009/>