NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider World: Successful Applications of Linked Data
1. NISO – NFAIS Webinar
www.accessinn.com
www.dataharmony.com
505-998-0800
Marjorie M.K. Hlava
President and Chief Scientist
Access Innovations, Inc.
Linked Data:
Making it a Reality
2. Outline of the talk
Linked data potential
Leveraging the Thesaurus / Taxonomy / Ontology
Automating the linking
Workflow possibilities
Linked data principles
A few cautions
3. Linked Data: Many definitions
Mash Ups
Live linking from multiple sources
Linking out to external datasets
Linking persistent URIs to datasets
Linked Data Repositories
Defining relationships in RDF triples
Taxonomies, thesauri, ontologies
Triple stores
SKOS or OWL format
4. Authors at a place
Mash up locations to a GPS grid of an area
Two data points
GPS Coordinates
Taxonomy description of the place
8. Consider more personnel at these locations
Two data points
GPS Coordinates
Taxonomy description of the crime
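The mash-up on the two slides above (a GPS coordinate plus a taxonomy description for each record) can be sketched as a simple grid count. This is only an illustration under stated assumptions, not the system shown in the talk: the names `grid_cell` and `mash_up`, the 0.01-degree cell size, and the sample records are all invented for the example.

```python
import math
from collections import Counter, defaultdict

def grid_cell(lat, lon, cell_deg=0.01):
    """Map a GPS coordinate to a grid cell index (cell size is an assumption)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def mash_up(records, cell_deg=0.01):
    """Count taxonomy terms per grid cell.

    records: iterable of (lat, lon, taxonomy_term) tuples.
    """
    cells = defaultdict(Counter)
    for lat, lon, term in records:
        cells[grid_cell(lat, lon, cell_deg)][term] += 1
    return dict(cells)

# Invented sample data: two data points per record, as on the slide.
records = [
    (35.0844, -106.6504, "Burglary"),
    (35.0849, -106.6509, "Burglary"),
    (35.1150, -106.6150, "Vandalism"),
]
cells = mash_up(records)

# The cell with the most records is a candidate for "more personnel".
busiest = max(cells, key=lambda c: sum(cells[c].values()))
```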
9. Points to Linked Data
Point to relevant resources via URLs
Leverage the thesaurus for rich ontology
Link to other data repositories
Databases
People nets
Resource files
DBpedia
11. Link to Many Resources
Journal Article on Topic A
Other Journal Articles on Topic A
Upcoming Conference on Topic A
Podcast Interview with Researcher Working on Topic A
Grant Available for Researchers Working on Topic A
CME Activity on Topic A
Job Posting for Expert on Topic A
12. Selected Article Search “thin film sputtering”
More Articles on the same topic
Grants available
Upcoming conferences on this topic
Authors working in this space
13. Optics
Definition of the concept
Links to concept pages in other sources (OSA, SPIE, IOP, AIP, etc.)
Link to journals that publish on the subject
People and companies in the space
Optics DBpedia
http://dbpedia.org/page/Optics
Etc.
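One way to pull a "definition of the concept" from DBpedia is a SPARQL query for the resource's abstract. The sketch below only builds the query text and the request URL; it does not contact the endpoint. Treating `dbo:abstract` as the definition is an assumption, as are the helper names `abstract_query` and `query_url`. The resource URI `http://dbpedia.org/resource/Optics` is the identifier behind the `/page/Optics` view listed on the slide.

```python
from urllib.parse import urlencode

# Public DBpedia endpoint; availability is not guaranteed.
DBPEDIA_SPARQL = "https://dbpedia.org/sparql"

def abstract_query(resource_uri, lang="en"):
    """Build a SPARQL query asking for the abstract of one resource."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <{resource_uri}> dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "}"
    )

def query_url(resource_uri, lang="en"):
    """Encode the query as a GET URL, per the SPARQL protocol's `query` parameter."""
    return DBPEDIA_SPARQL + "?" + urlencode({"query": abstract_query(resource_uri, lang)})

optics_query = abstract_query("http://dbpedia.org/resource/Optics")
optics_url = query_url("http://dbpedia.org/resource/Optics")
```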
16. Linking Workflow
Link content to external databank
Make Potential URI matches
QC for the thesaurus domain
Matched URIs enrich the content
17. Linking Workflow
Taxonomy Term
DBpedia
Potential Match
Retry?
Add to Statistics Report
QC: Match?
Add Definition to Thesaurus
SPARQL Definition Query
Add URI to Thesaurus
SILK Query
NO
YES
Returns URI
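The workflow on the two slides above (look up a potential match, run QC, retry or record the miss) might be sketched as the loop below. This is a rough outline under assumptions, not the Data Harmony implementation: `link_terms`, `LinkReport`, the toy lookup, and the always-accepting QC function are hypothetical stand-ins for the SILK query and the human review step.

```python
from dataclasses import dataclass, field

@dataclass
class LinkReport:
    matched: dict = field(default_factory=dict)    # term -> accepted URI
    unmatched: list = field(default_factory=list)  # terms for the statistics report

def link_terms(terms, find_candidate, qc_accepts, max_tries=2):
    """Try to link each thesaurus term to an external URI.

    find_candidate(term, attempt) returns a candidate URI or None.
    qc_accepts(term, uri) returns True when the match passes QC.
    """
    report = LinkReport()
    for term in terms:
        accepted = None
        for attempt in range(max_tries):           # the "Retry?" branch
            candidate = find_candidate(term, attempt)
            if candidate is not None and qc_accepts(term, candidate):
                accepted = candidate
                break
        if accepted is None:
            report.unmatched.append(term)          # goes to the statistics report
        else:
            report.matched[term] = accepted        # URI is added to the thesaurus
    return report

# Toy stand-ins for the external lookup and the QC step.
TOY_INDEX = {"optics": "http://dbpedia.org/resource/Optics"}

def toy_lookup(term, attempt):
    # First attempt: label as written; retry: lower-cased label.
    key = term if attempt == 0 else term.lower()
    return TOY_INDEX.get(key)

def toy_qc(term, uri):
    return True  # a real QC step would check the match against the thesaurus domain

report = link_terms(["Optics", "Thin film sputtering"], toy_lookup, toy_qc)
```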
18. Phrasing of Concepts will Vary
Exact concept match
Add the URI to a field in the thesaurus
Different phrasing
“Research funding” vs. “Funding of science”
SILK: http://personal.sirma.bg/vladimir/misc/silk-book.pdf
False matches
Ecosystem engineering vs Ecosystem engineer
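The three cases on this slide can be illustrated with plain string similarity from Python's standard library. This is a simplified sketch, not how SILK computes matches: the `classify` function and the 0.85 threshold are assumptions chosen for the example. It shows why near-identical strings still need QC and why differently phrased labels need synonyms rather than string comparison.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Character-level similarity between two labels, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def classify(label, candidate, threshold=0.85):
    """Sort a label/candidate pair into one of three buckets."""
    if label.lower() == candidate.lower():
        return "exact"            # safe to add the URI to the thesaurus
    if similarity(label, candidate) >= threshold:
        return "review"           # close strings, possibly a false match
    return "no string match"      # may still be the same concept, phrased differently

cases = {
    ("Optics", "optics"): classify("Optics", "optics"),
    ("Ecosystem engineering", "Ecosystem engineer"):
        classify("Ecosystem engineering", "Ecosystem engineer"),
    ("Research funding", "Funding of science"):
        classify("Research funding", "Funding of science"),
}
```

The second pair scores high although the concepts differ (a process versus an organism), and the third pair scores low although the concepts correspond.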
19. Automating the Linking
Not every concept will have a match
Or a resource page
Semantic functionality:
Lots of synonyms will help
Proximity and other rules
Create new resources or landing pages
20. Linking Out to External
Datasets
Link Thesaurus Preferred Terms
Resource describing the thesaurus concept
In SKOS parlance, this is “the same as”
Identify DBpedia pages for each term
Identify other sources
Backfill knowledge gaps
Concept exists
No content pages yet available
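Linking preferred terms out to DBpedia, and listing the concepts still waiting for a target, could look like the sketch below. It assumes that `skos:exactMatch` is the property meant by "the same as"; the base URI `http://example.org/thesaurus/` and the helper names are hypothetical.

```python
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"
BASE = "http://example.org/thesaurus/"  # hypothetical thesaurus namespace

def concept_uri(term):
    """Mint a URI for a preferred term (the slug rule is an assumption)."""
    return BASE + term.strip().lower().replace(" ", "-")

def mapping_triples(matches):
    """Turn {preferred term: external URI} into (subject, predicate, object) triples."""
    return [(concept_uri(term), SKOS_EXACT_MATCH, uri)
            for term, uri in sorted(matches.items())]

def knowledge_gaps(terms, matches):
    """Concepts that exist in the thesaurus but have no external page yet."""
    return [term for term in terms if term not in matches]

terms = ["Optics", "Thin film sputtering"]
matches = {"Optics": "http://dbpedia.org/resource/Optics"}

triples = mapping_triples(matches)
gaps = knowledge_gaps(terms, matches)
```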
25. The Glue
To connect – a communication point
APIs
Application Programming Interface
JDBC, ODBC
Web Calls – Web Services
Data transfer formats
RDF Serialization formats
26. RDF serialization formats
Turtle: a compact, human-friendly format.
N-Triples: a very simple, easy-to-parse, line-based format that is not as compact as Turtle.
N-Quads: a superset of N-Triples, for serializing multiple RDF graphs.
JSON-LD: a JSON-based serialization.
N3 (Notation 3): a non-standard serialization that is very similar to Turtle, but has some additional features, such as the ability to define inference rules.
RDF/XML: an XML-based syntax that was the first standard format for serializing RDF.
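To make the difference between the line-based and the compact formats concrete, the sketch below writes one triple as N-Triples, N-Quads, and Turtle using plain string formatting. It is deliberately simplified: it handles only URI objects (no literals or blank nodes) and does not escape special characters, so a real project would use an RDF library instead. The example URIs under `http://example.org/` are hypothetical.

```python
def n_triple(s, p, o):
    """One triple per line, full URIs in angle brackets."""
    return f"<{s}> <{p}> <{o}> ."

def n_quad(s, p, o, graph):
    """N-Triples plus a fourth term naming the graph."""
    return f"<{s}> <{p}> <{o}> <{graph}> ."

def turtle(s, p, o, prefixes):
    """Same triple with prefix declarations to shorten the URIs."""
    def shorten(uri):
        for name, namespace in prefixes.items():
            if uri.startswith(namespace):
                return f"{name}:{uri[len(namespace):]}"
        return f"<{uri}>"
    lines = [f"@prefix {name}: <{namespace}> ." for name, namespace in prefixes.items()]
    lines.append(f"{shorten(s)} {shorten(p)} {shorten(o)} .")
    return "\n".join(lines)

S = "http://example.org/thesaurus/optics"             # hypothetical concept URI
P = "http://www.w3.org/2004/02/skos/core#exactMatch"
O = "http://dbpedia.org/resource/Optics"
PREFIXES = {
    "ex": "http://example.org/thesaurus/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dbr": "http://dbpedia.org/resource/",
}

nt = n_triple(S, P, O)
nq = n_quad(S, P, O, "http://example.org/graphs/mappings")
ttl = turtle(S, P, O, PREFIXES)
```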
27. But What about Triples?
SKOS
Simple Knowledge Organization System
Triples
RDF Statements
Resource Description Framework
Subject Predicate Object
OWL
Web Ontology Language
Formats
28. Recursive triple challenges
The Edition is in London
The Edition is a hotel
The book has a second edition
Therefore = The book is a hotel
Margie is a member of NFAIS
NFAIS is in Baltimore
Therefore = Margie is in Baltimore
Need clear disambiguation = thesaurus
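The "book is a hotel" problem can be reproduced in a few lines: when two different things share the label "Edition", following links from the book reaches the hotel. Giving each thing its own identifier removes the false path. This is a toy sketch; the predicate names and the `edition:` / `hotel:` identifiers are invented for illustration.

```python
def reachable(triples, start):
    """Every node that can be reached from `start` by following triples."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for subject, _predicate, obj in triples:
            if subject == node and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen

# One label, "Edition", used for both a hotel and a book edition.
ambiguous = [
    ("book", "hasEdition", "Edition"),
    ("Edition", "isA", "Hotel"),
    ("Edition", "locatedIn", "London"),
]

# Distinct identifiers, as a thesaurus with clear disambiguation would provide.
disambiguated = [
    ("book", "hasEdition", "edition:second"),
    ("hotel:TheEdition", "isA", "Hotel"),
    ("hotel:TheEdition", "locatedIn", "London"),
]

from_book_ambiguous = reachable(ambiguous, "book")
from_book_clean = reachable(disambiguated, "book")
```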
29. Metrics – Measuring
Accuracy
The level of accuracy with which we matched concepts:
How many match correctly?
How many match incorrectly?
The number of concepts with no match
Number of autolink populated pages
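The counts listed above can be tallied from QC outcomes as in the sketch below. The outcome labels (`"correct"`, `"incorrect"`, `"none"`), the metric names, and the sample data are assumptions made for the example, not figures from the talk.

```python
from collections import Counter

def linking_metrics(outcomes):
    """Summarize QC outcomes.

    outcomes: {term: "correct" | "incorrect" | "none"}
    """
    counts = Counter(outcomes.values())
    correct, incorrect, unmatched = counts["correct"], counts["incorrect"], counts["none"]
    matched = correct + incorrect
    return {
        "correct": correct,
        "incorrect": incorrect,
        "no_match": unmatched,
        # Share of proposed matches that passed QC.
        "accuracy_of_matches": correct / matched if matched else 0.0,
        # Share of all concepts that received any match.
        "coverage": matched / len(outcomes) if outcomes else 0.0,
    }

# Invented sample outcomes.
sample = {
    "Optics": "correct",
    "Thin films": "correct",
    "Ecosystem engineering": "incorrect",
    "Research funding": "none",
}
metrics = linking_metrics(sample)
```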
31. Two Linked Data Camps
Linked data
Linked OPEN data
Free or security gate
Linking within a collection
Linking with permission
Linking freely on the web
32. Linked Data is about
Using the Web to connect related data that wasn't previously linked
Using the Web to lower the barriers to linking data currently linked using other methods
A recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge
Using URIs and RDF to create a semantic web
33. Linked Data Principles
Use URIs as names for things
Use HTTP URIs so that people can look up those names
When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
Include links to other URIs, so that they can discover more things
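"Looking up" an HTTP URI usually means an HTTP GET with an `Accept` header that asks for an RDF format. The sketch below prepares such a request with Python's standard library; it assumes the server supports content negotiation for `text/turtle`, and the helper names are hypothetical. Only `look_up` touches the network, and it is not called here.

```python
from urllib.request import Request, urlopen

def rdf_request(uri, media_type="text/turtle"):
    """Prepare a GET request that asks for RDF instead of HTML."""
    return Request(uri, headers={"Accept": media_type})

def look_up(uri, media_type="text/turtle", timeout=10):
    """Dereference the URI and return the response body as text (network call)."""
    with urlopen(rdf_request(uri, media_type), timeout=timeout) as response:
        return response.read().decode("utf-8")

# Building the request does not contact the server.
req = rdf_request("http://dbpedia.org/resource/Optics")
```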
34. The Linked Data Community
W3C standards and working groups
RDF
Linked Open Data Repositories
Dublin Core – DCMI
35. More Buzzwords
FOAF
Subject – Predicate – Object
Graph view – two ends of a link
Dereference
Dog food
SPARQL
… it's easy to quickly get into the weeds
38. Linked Data Cautions
Never change your URIs:
It will break the links, unless you maintain a map…
Need persistent identifiers
..SQL indicates a relational database
Java and object-oriented databases not broadly supported yet
Ensure that your triples are not recursive loops
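Two of these cautions lend themselves to simple checks: resolving a retired URI through a maintained map, and detecting loops among triples. The sketch below is one possible approach under assumptions; the function names, the `broader` predicate, and the sample URIs are hypothetical.

```python
def resolve(uri, moved):
    """Follow a map of retired URI -> replacement URI until a current URI is found."""
    seen = set()
    while uri in moved:
        if uri in seen:
            raise ValueError(f"redirect loop at {uri}")
        seen.add(uri)
        uri = moved[uri]
    return uri

def has_cycle(triples, predicate):
    """True if following `predicate` from some subject leads back to itself."""
    graph = {}
    for subject, pred, obj in triples:
        if pred == predicate:
            graph.setdefault(subject, []).append(obj)

    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node that is still on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, []))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))

moved = {"http://example.org/old": "http://example.org/mid",
         "http://example.org/mid": "http://example.org/new"}
current = resolve("http://example.org/old", moved)

looping = has_cycle([("a", "broader", "b"), ("b", "broader", "a")], "broader")
clean = has_cycle([("a", "broader", "b"), ("b", "broader", "c")], "broader")
```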
39. It’s What We Do With the Data
The formats will continue to vary
Words will continue to be a challenge
It's what we do with the data that is important.
The delivery
The concepts
Allowing the user to find the thread and follow it instead of giving them yet another resource to go to.
41. We covered…
Linked data potential
Leveraging the Thesaurus / Taxonomy / Ontology
Automating the linking
Linked data principles
A few cautions
Now…
42. It Just Takes
a Little
Imagination
Thank you
Marjorie M.K. Hlava, President
Access Innovations
505-998-0800
mhlava@accessinn.com
43. What we do
Access Innovations
Ensure clean, well formed content
Create Knowledge Organization Systems (KOS)
Data Harmony Tools
To automatically index content
To manage KOS and more
To semantically enrich the content
To organize the content
Access Integrity
Automated Medical Coding Support
44. About Access Innovations
Access Innovations are experts in content creation, enrichment, and
conversion services. We provide services to semantically enrich and tag raw
text into highly structured data. We deliver clean, well-formed, metadata-enriched
content so our clients can reuse, repurpose, store, and find their
knowledge assets. We go beyond the standards to build taxonomies and
other data control structures as a solid foundation for your information.
Our services and software allow organizations to use and present their
information to both internal and external constituents by leveraging search,
presentation, e-commerce and linking. We change search to found!
Quick Facts
• Founded in 1978
• Headquartered in Albuquerque, NM
• Privately held
• Delivered more than 2000 engagements
45. Data, Information, Knowledge
Abstraction Interpretation
Data Information Knowledge
Data = height of Mt. Everest
Information = a book on Mt. Everest geological characteristics
Knowledge = a report containing practical information on the best way to reach Mt. Everest's peak
Editor's notes
Thanks to Helen Atkins of AACR for this illustration.
The real power of this is that the links can all go in all directions, so we take advantage of having the user’s attention regardless of how they step into our “web”.