http://www.niso.org/news/events/2013/webinars/linked_data

NISO Webinar:
Library Linked Data:
From Vision to Reality
December 11, 2013

Speakers:
Jon Voss - Strategic Partnerships Director, We Are What We Do
Matt Miller - Front End Developer, NYPL Labs at the New York Public Library
Silvia Southwick - Digital Collections Metadata Librarian, UNLV University
Libraries
Cory Lampert - Head, Digital Collections, UNLV University Libraries
Linked Jazz
Revealing the
relationships of the
jazz community

Matt Miller
@thisismmiller
December 2013
Project Overview
• Investigating the application of Linked Open Data
to enhance the discovery and visibility of digital
cultural heritage materials.
• Build new methods of connecting cultural data.
• Uncover meaningful connections between
documents and data related to the personal and
professional lives of musicians who often practice
in rich and diverse social networks.
Professor Cristina Pattuelli at the Pratt Institute School of Library Information Science is the
director of the project, which began in 2011.
Linked Data Now!
Why?
• Bootstrap your project with existing data.
• Highlights knowledge you have created and
knowledge that is missing.
• Facilitates sharing, but also growing your own
project.
Bootstrapping – Identifying

Research Question
How can we discover and analyze
the rich and diverse network of
relationships between jazz
musicians?

Primary Sources
Oral history interview transcripts
of jazz musicians.

We need to know the names
(and variants) of jazz
musicians in a structured
controlled vocabulary.
Bootstrapping – Identifying

Charlie Parker
Many different LOD datasets contain this
information. We need to access, query, and link it
for jazz-related individuals only.
Bootstrapping – Querying
• Processing the DBpedia dataset resulted in around
9,000 URIs.
– DBpedia is fluid! After each release (currently 3.9) we
reprocess the files, which adds 500-700
URIs.

• We now have a name directory, but we want
additional forms of personal names. To accomplish
this, we try mapping to the Library of Congress (LC).
• Matching DBpedia and LC URIs is not automatic.
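The extraction step above can be approximated with a short script over a DBpedia N-Triples extract file. This is a minimal sketch, not the project's actual code. The slides do not say which DBpedia properties were used to select jazz individuals, so the genre predicate and the Jazz resource below are assumptions, and `jazz_subjects` is a hypothetical helper name.

```python
import re

# Assumed selection criterion: subjects whose genre is the Jazz resource.
# The slides do not state which properties the project actually filtered on.
GENRE = "http://dbpedia.org/ontology/genre"
JAZZ = "http://dbpedia.org/resource/Jazz"

# Matches N-Triples lines of the form <subject> <predicate> <object> .
# Lines whose object is a literal are ignored by this pattern.
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$')

def jazz_subjects(lines):
    """Return the set of subject URIs that have a genre triple pointing at Jazz."""
    found = set()
    for line in lines:
        m = TRIPLE.match(line.strip())
        if m and m.group(2) == GENRE and m.group(3) == JAZZ:
            found.add(m.group(1))
    return found

# Tiny illustrative sample; a real run would iterate over an extract file instead.
sample = [
    '<http://dbpedia.org/resource/Charlie_Parker> <http://dbpedia.org/ontology/genre> <http://dbpedia.org/resource/Jazz> .',
    '<http://dbpedia.org/resource/Charlie_Parker> <http://xmlns.com/foaf/0.1/name> "Charlie Parker"@en .',
    '<http://dbpedia.org/resource/Johnny_Cash> <http://dbpedia.org/ontology/genre> <http://dbpedia.org/resource/Country_music> .',
]
print(jazz_subjects(sample))
```

Because the function accepts any iterable of lines, an open file handle can be passed directly, so a large extract never has to be loaded into memory. Rerunning it on a newer release and taking the set difference would show which URIs were added.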
Bootstrapping – Mapping
• We matched identities based on:
• Name
• Life Dates
• Whitelisted words found in sources
(http://www.loc.gov/mads/rdf/v1#Source)
• Reconciling authorities is difficult!
• Use others' work: http://viaf.org/viaf/data/
• But don’t discount your own processes.
• Using our relatively simple process we
were able to match about 1500 more URIs
than VIAF.org.
• This is due to a smaller domain (jazz).
Our name directory creation and authority
matching is documented:
https://github.com/thisismattmiller/linkedjazz-name-directory
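The three matching criteria above (name, life dates, whitelisted words in the source notes) can be sketched as follows. This is an illustration under stated assumptions, not the documented process from the repository. The `Record` type, the whitelist contents, the example.org URIs, and the rule that dates take precedence over source words when both records carry them are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical whitelist; the real word list is not given in the slides.
WHITELIST = {"jazz", "saxophonist", "trumpeter", "pianist", "bandleader"}

@dataclass
class Record:
    uri: str
    name: str
    birth: Optional[int] = None
    death: Optional[int] = None
    sources: List[str] = field(default_factory=list)

def name_key(name: str) -> str:
    """Lowercase, drop punctuation, sort tokens: 'Parker, Charlie' equals 'Charlie Parker'."""
    return " ".join(sorted(re.findall(r"\w+", name.lower())))

def dates_agree(a: Record, b: Record) -> Optional[bool]:
    """True or False when at least one date is known on both sides, else None."""
    pairs = [(a.birth, b.birth), (a.death, b.death)]
    known = [(x, y) for x, y in pairs if x is not None and y is not None]
    if not known:
        return None
    return all(x == y for x, y in known)

def has_whitelisted_source(rec: Record) -> bool:
    """Check whether any source note contains a whitelisted word."""
    words = set(re.findall(r"[a-z]+", " ".join(rec.sources).lower()))
    return bool(words & WHITELIST)

def is_match(dbpedia: Record, lc: Record) -> bool:
    """Names must agree; then dates decide if comparable, otherwise source words do."""
    if name_key(dbpedia.name) != name_key(lc.name):
        return False
    agree = dates_agree(dbpedia, lc)
    if agree is not None:
        return agree
    return has_whitelisted_source(lc)

db = Record("http://dbpedia.org/resource/Charlie_Parker", "Charlie Parker", 1920, 1955)
lc_same = Record("http://example.org/lc/a", "Parker, Charlie", 1920, 1955)
lc_other = Record("http://example.org/lc/b", "Parker, Charlie", 1871)
lc_undated = Record("http://example.org/lc/c", "Parker, Charlie",
                    sources=["Liner notes (jazz saxophonist)"])
print(is_match(db, lc_same), is_match(db, lc_other), is_match(db, lc_undated))
```

A domain-specific whitelist is what lets a small project accept matches that a general-purpose service would leave unresolved, which is consistent with the slide's point about the smaller jazz domain.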
Bootstrapping – Curating

http://linkedjazz.org/public_demo_mapping/
Bootstrapping – Review
• Start small, think big.
– Specific subject domain.
– Large infrastructure not required (triple stores, etc.)
• Can get started with extract files and Python scripting.

• Reuse as much as possible, but try new processes
leveraging domain specificity.
• Always be curating: use tools to facilitate the process, but
a human hand is often required.
Applying the Data
• Use the name directory to locate individuals in
the interview transcript.
• This project phase involves 50 transcripts.
• Because the names are tied to URIs we can
infer a relationship triple between two
individuals.
<foaf:Person> <rel:knowsOf> <foaf:Person>
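The inference described above can be sketched in a few lines: look up directory names in a transcript and emit one rel:knowsOf triple per person mentioned. This is a simplified illustration, assuming a directory that maps name variants to URIs. The function names, the sample sentence, and the interviewee URI are invented for the example, and the project's real pipeline includes human curation that this omits.

```python
import re

# rel:knowsOf from the Relationship vocabulary.
KNOWS_OF = "http://purl.org/vocab/relationship/knowsOf"

def mentioned_uris(text, directory):
    """Return sorted URIs of every directory name that appears as a whole word or phrase."""
    found = set()
    for name, uri in directory.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE):
            found.add(uri)
    return sorted(found)

def knows_of_triples(interviewee_uri, text, directory):
    """Emit one N-Triples line per mentioned person, skipping self-mentions."""
    return [
        f"<{interviewee_uri}> <{KNOWS_OF}> <{uri}> ."
        for uri in mentioned_uris(text, directory)
        if uri != interviewee_uri
    ]

# Name variants (here a nickname) resolve to the same URI.
directory = {
    "Charlie Parker": "http://dbpedia.org/resource/Charlie_Parker",
    "Bird": "http://dbpedia.org/resource/Charlie_Parker",
    "Dizzy Gillespie": "http://dbpedia.org/resource/Dizzy_Gillespie",
}
# Invented sample sentence, not taken from a real transcript.
text = "I first heard Bird with Dizzy Gillespie. Birdland came later."
triples = knows_of_triples("http://example.org/interviewee", text, directory)
for t in triples:
    print(t)
```

Word-boundary matching keeps "Birdland" from counting as a mention of "Bird", but short nicknames still produce false positives in real text, which is one reason the detected names need to be verified by a person.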
Transcript Analyzer
• An interface to curate the transcripts and verify
detected names.
• Implements off-the-shelf NLP (NLTK) to attempt
to locate additional names not in our directory as
well as corporate names and locations.
• Global rule system: as we process more
transcripts, the system is being trained.
• Using URIs to represent entities, we can quickly
see where we are discovering new material.
– 50 Transcripts
• 1800 person entities tagged.
• 250 names tagged without authoritative URI.
– Knowledge Creation
New Dataset
• We have now created a new LOD dataset of
jazz musicians’ relationships.
• Our next steps are:
– Visualize.
– Further qualify the rel:knowsOf relationships.
– Provide access to the data created.
Visualize

http://linkedjazz.org/network/
Qualify Relationships – 52nd St.
• Recruit jazz experts and enthusiasts to help
categorize relationships based on transcript
text.
• We use existing vocabularies to build the data
set: FOAF, Relationship Vocabulary, Music
Ontology.
• The interface is critical for crowdsourcing tools.
We work with user experience experts and
conduct user studies to refine our public-facing
tools.
Qualify Relationships – 52nd St.

http://linkedjazz.org/52ndStreet/
Provide Access
• We provide a SPARQL endpoint.
• But also a traditional API:
– http://linkedjazz.org/api/
– Can return:
• JSON
• N-Triples
• Gephi graph files (GEXF)
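To make the graph-file option concrete, here is a minimal sketch of how a list of person-to-person relationships could be serialized as GEXF for Gephi. It does not reproduce the Linked Jazz API, whose request format is not described in the slides. `to_gexf` is a hypothetical helper, and only the basic GEXF 1.2 elements (nodes and edges) are written.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def to_gexf(edges):
    """Serialize (source_uri, target_uri) pairs as a directed GEXF graph string."""
    gexf = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(gexf, "graph", {"defaultedgetype": "directed"})
    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")

    # Assign each distinct URI a short node id in order of first appearance.
    ids = {}
    for source, target in edges:
        for uri in (source, target):
            if uri not in ids:
                ids[uri] = str(len(ids))
                ET.SubElement(nodes_el, "node", {"id": ids[uri], "label": uri})

    for i, (source, target) in enumerate(edges):
        ET.SubElement(
            edges_el, "edge",
            {"id": str(i), "source": ids[source], "target": ids[target]},
        )
    return ET.tostring(gexf, encoding="unicode")

# Illustrative edge list; in practice it would come from the knowsOf triples.
edges = [
    ("http://dbpedia.org/resource/Charlie_Parker",
     "http://dbpedia.org/resource/Dizzy_Gillespie"),
    ("http://dbpedia.org/resource/Dizzy_Gillespie",
     "http://dbpedia.org/resource/Charlie_Parker"),
]
gexf_xml = to_gexf(edges)
print(gexf_xml)
```

Writing the returned string to a file with a .gexf extension should give something Gephi can open directly, so the same relationships can be explored visually without a triple store.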
Learn and Grow as a Team
• Experience through doing.
• Empower graduate
students with skills and
practical experience
working with a LOD
project.
• Use the project as a
vehicle to make intra- and
inter-institutional
collaborations.

Linked Jazz Team July 2013
Next Steps
• Refactor our prototype tools into sustainable open
source projects.
• Redesign 52nd St. based on user study groups.
• Work on emerging collaborations with Jazz Centers.
Thanks!

http://www.linkedjazz.org
Linked, Exposed Data: UNLV
Linked Data Project
NISO Webinar: Library Linked Data: From Vision to
Reality
December 11, 2013

Silvia B. Southwick
Digital Collections Metadata Librarian
UNLV Libraries

Cory K. Lampert
Head, Digital Collections
UNLV Libraries
Agenda
• Motivation
• Environment
• UNLV Linked Data project
• Technologies
• Transforming metadata into linked data
• Next steps
How it Started
• Conferences and “buzz”
• Curiosity and professional development
• Exploration and pilot project
• Compelling results; sharing impact of what
we’ve learned
• Assessment
• Much more to do...
Current Practice
• Data (or metadata) encapsulated in records
• Records contained in collections
• Very few links are created within and/or across
collections
• Links have to be manually created
• Existing links do not specify the nature of the
relationships among records
This structure hides potential links within and
across collections
What we can do with linked data
• Free data from silos
• Expose relationships
• Powerful, seamless interlinking of our data
• Users interact or query data in new ways
• Search results would be more precise
• Data can be easily repurposed
Making the Case for Linked Data in
Academic Library Digital Collections
– Problem: Rich metadata is being lost in dumbed-down
DC records
– Issue: Investment and resource allocation (item-level
philosophy)
– Goal: Increased exposure, collaboration, and
openness
– Outcome: Increased discovery and user focus
Gaining Buy In
Administration
• Innovative project, high impact
• Pilot, experiment, learn by doing, share results
Staff
• We already have the metadata; we need to
transform it into triples
• Managing change
Graphical Representation: One Record
Examples of records
(Diagram; recoverable labels: “December 12, 1915”, “title”)
Implications (Internal)
• Cross-unit collaboration is necessary
• Staff expertise will evolve
• Staff roles will change to accommodate new /
parallel workflow
• Data clean-up will be an investment
• Management of data becomes critical
• Discovery issues = user interfaces still need
development
Implications (External)
• Publish data from our collections in the Linked
Data Cloud to improve discoverability and
connections with other related data sets on
the Web
• Sharing data in new ways with new partners
may raise new issues
• Need to engage with linked data community
for technologies, tools, best practices, and to
demand library vendor support for LOD.
UNLV Linked Data Project
Goals:
• Study the feasibility of developing a common
process that would allow the conversion of our
collection records into linked data preserving
their original expressivity and richness
• Publish data from our collections in the Linked
Data Cloud to improve discoverability and
connections with other related data sets on the
Web
PROJECT IMPLEMENTATION
Actions and technologies:
• Prepare data, export data: CONTENTdm
• Import data, clean data, reconcile, generate triples, export RDF: OpenRefine
• Import data, publish: Mulgara / Virtuoso
Prepare / Export Data
Technology: CONTENTdm
• Increase consistency across collections:
– metadata element labels
– use of CV, share local CVs
– etc.

• Export data as spreadsheet
Create mapping between metadata elements and
EDM model predicates
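The export-and-map step can be illustrated outside OpenRefine with a small script that turns a spreadsheet export into N-Triples. This is a sketch under assumptions, not the UNLV workflow, which uses OpenRefine and its RDF extension. The column names, the two Dublin Core predicates in `MAPPING`, the example.org base URI, and the ";" separator for multi-value cells are all hypothetical.

```python
import csv
import io

# Hypothetical mapping from spreadsheet column labels to predicates.
# The project's actual element-to-EDM mapping is not listed in the slides.
MAPPING = {
    "Title": "http://purl.org/dc/elements/1.1/title",
    "Subject": "http://purl.org/dc/elements/1.1/subject",
}

def escape(value):
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')

def rows_to_ntriples(csv_text, base_uri, id_column="ID", sep=";"):
    """Emit one triple per non-empty value, splitting multi-value cells on sep."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base_uri}{row[id_column]}>"
        for column, predicate in MAPPING.items():
            for value in (row.get(column) or "").split(sep):
                value = value.strip()
                if value:
                    triples.append(f'{subject} <{predicate}> "{escape(value)}" .')
    return triples

# Invented one-row export for illustration.
csv_text = "ID,Title,Subject\n1,Costume sketch,Showgirls; Costume design\n"
triples = rows_to_ntriples(csv_text, "http://example.org/item/")
for t in triples:
    print(t)
```

Consistent element labels across collections matter here: one mapping table can only be reused for every export if the column names agree, which is the point of the consistency work listed above.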
OpenRefine
• Open source
• It is a server, so it can communicate with other
datasets via HTTP
• OpenRefine and its RDF extension should be
installed
Screenshots to show some of the functions we have
used
OpenRefine first screen
Facets
Split multi-value cells
Facet view for Graphic Elements after splitting
Reconciliation
Specifying Reconciliation service
Activating Reconciliation
Creating a Skeleton
Exporting RDF files
Actions and technologies:
• Prepare data, export data: CONTENTdm
• Import data, clean data, reconcile, generate triples, export RDF: OpenRefine
• Import data, publish, query: Mulgara / Virtuoso
Mulgara Triple Store: Import
A simple SPARQL query

SELECT *
WHERE { ?s ?p ?o }
LIMIT 100
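A query like the one above is typically sent to an endpoint over HTTP using the SPARQL Protocol's `query` parameter. The sketch below only builds the request and does not send it. The endpoint URL is an assumption (a local Virtuoso instance on its default port), and `build_sparql_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode, urlparse, parse_qs
from urllib.request import Request

QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 100"

def build_sparql_request(endpoint, query):
    """Build a GET request that asks the endpoint for JSON-formatted results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

# Assumed endpoint; replace with the address of the triple store actually in use.
req = build_sparql_request("http://localhost:8890/sparql", QUERY)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would execute the query against a running endpoint. The LIMIT clause keeps an exploratory "show me anything" query from returning the entire store.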
SPARQL: Querying Data

• Using Virtuoso PivotViewer
Query
Costume Design Drawings
Showgirls
Next steps for the UNLV project
• Transform all digital collections into linked data
(parallel structure)
• Increase linkage with other datasets
• Design interfaces to access and display our data
and related data from other datasets
• Evaluate alternative interfaces from the user’s
perspective
• Produce a cost-benefit analysis to inform future
plans for the development of digital collections
Thank You!
Questions?
NISO Webinar:
Library Linked Data: From Vision to Reality

Questions?
All questions will be posted with presenter answers on
the NISO website following the webinar:
http://www.niso.org/news/events/2013/webinars/linked_data

THANK YOU
Thank you for joining us today.
Please take a moment to fill out the brief online survey.
We look forward to hearing from you!

NISO Webinar: Library Linked Data: From Vision to Reality

  • 1. http://www.niso.org/news/events/2013/webinars/linked_data NISO Webinar: Library Linked Data: From Vision to Reality December 11, 2013 Speakers: Jon Voss - Strategic Partnerships Director, We Are What We Do Matt Miller - Front End Developer, NYPL Labs at the New York Public Library Silvia Southwick - Digital Collections Metadata Librarian, UNLV University Libraries Cory Lampert - Head, Digital Collections , UNLV University Libraries
  • 80. Linked Jazz Revealing the relationships of the jazz community Matt Miller @thisismmiller December 2013
  • 81. Project Overview • Investigate the application of Linked Open Data to enhance the discovery and visibility of digital cultural heritage materials. • Build new methods of connecting cultural data. • Uncover meaningful connections between documents and data related to the personal and professional lives of musicians, who often practice in rich and diverse social networks. Professor Cristina Pattuelli at the Pratt Institute School of Library Information Science is the director of the project, which began in 2011.
  • 82. Linked Data Now! Why? • Bootstrap your project with existing data. • Highlights knowledge you have created and knowledge that is missing. • Facilitates sharing, but also growing your own project.
  • 83. Bootstrapping – Identifying Research Question How can we discover and analyze the rich and diverse network of relationships between jazz musicians? Primary Sources Oral history interview transcripts of jazz musicians.
  • 84. Bootstrapping – Identifying Research Question How can we discover and analyze the rich and diverse network of relationships between jazz musicians? Primary Sources Oral history interview transcripts of jazz musicians. We need to know the names (and variants) of jazz musicians in a structured controlled vocabulary.
  • 85. Bootstrapping – Identifying Charlie Parker Many different LOD datasets contain this information. We need to access, query and link it for only jazz related individuals.
  • 87. Bootstrapping – Querying • Processing the DBpedia dataset resulted in around 9,000 URIs. – DBpedia is fluid! After each release (currently 3.9) we reprocess the files, which adds 500-700 URIs. • We now have a name directory, but we want additional forms of personal names. To get them, we try mapping to the Library of Congress. • Matching DBpedia and LC URIs is not automatic.
  • 88. Bootstrapping – Mapping • We matched identities based on: • Name • Life dates • Whitelisted words found in sources (http://www.loc.gov/mads/rdf/v1#Source) • Reconciling authorities is difficult! • Use others’ work: http://viaf.org/viaf/data/ • But don’t discount your own processes. • Using our relatively simple process we were able to match about 1500 more URIs than VIAF.org. • This is due to a smaller domain (jazz). Our name directory creation and authority matching are documented at: https://github.com/thisismattmiller/linkedjazz-name-directory
  • 90. Bootstrapping – Review • Start small, think big. – Specific subject domain. – Large infrastructure not required (triple stores, etc.) • Can get started with extract files and Python scripting. • Reuse as much as possible, but try new processes that leverage domain specificity. • Always be curating: tools facilitate the process, but a human hand is often required.
  • 91. Applying the Data • Use the name directory to locate individuals in the interview transcript. • This project phase involves 50 transcripts. • Because the names are tied to URIs we can infer a relationship triple between two individuals. <foaf:Person> <rel:knowsOf> <foaf:Person>
  • 94. Transcript Analyzer • An interface to curate the transcripts and verify detected names. • Implements off-the-shelf NLP (NLTK) to attempt to locate additional names not in our directory, as well as corporate names and locations. • Global rule system: as we process more transcripts, the system is being trained. • Because URIs represent entities, we can quickly see where we are discovering new material. – 50 transcripts • 1800 person entities tagged. • 250 names tagged without authoritative URI. – Knowledge creation
  • 95. New Dataset • We have now created a new LOD dataset of jazz musicians’ relationships. • Our next steps are: – Visualize. – Further qualify the rel:knowsOf relationships. – Provide access to the data created.
  • 97. Qualify Relationships – 52nd St. • Recruit jazz experts and enthusiasts to help categorize relationships based on transcript text. • We use existing vocabularies to build the data set: FOAF, Relationship Vocabulary, Music Ontology • The interface is critical for crowdsourcing tools. We work with user experience experts and conduct user studies to refine our public-facing tools.
  • 98. Qualify Relationships – 52nd St. http://linkedjazz.org/52ndStreet/
  • 99. Provide Access • We provide a SPARQL endpoint. • But also a traditional API: – http://linkedjazz.org/api/ – Can return: • JSON • N-Triples • Gephi graph files (GEXF)
  • 100. Learn and Grow as a Team • Experience through doing. • Empower graduate students with skills and practical experience working with a LOD project. • Use the project as a vehicle to make intra- and inter-institutional collaborations. Linked Jazz Team July 2013
  • 101. Next Steps • Refactor our prototype tools into sustainable open source projects. • Redesign 52nd St. based on user study groups. • Work on emerging collaborations with Jazz Centers.
  • 103. Linked, Exposed Data: UNLV Linked Data Project NISO Webinar: Library Linked Data: From Vision to Reality December 11, 2013 Silvia B. Southwick Digital Collections Metadata Librarian UNLV Libraries Cory K. Lampert Head, Digital Collections UNLV Libraries
  • 104. Agenda • Motivation • Environment • UNLV Linked Data project • Technologies • Transforming metadata into linked data • Next steps
  • 105. How it Started • Conferences and “buzz” • Curiosity and professional development • Exploration and pilot project • Compelling results; sharing impact of what we’ve learned • Assessment • Much more to do...
  • 106. Current Practice • Data (or metadata) encapsulated in records • Records contained in collections • Very few links are created within and/or across collections • Links have to be manually created • Existing links do not specify the nature of the relationships among records. This structure hides potential links within and across collections.
  • 107. What we can do with linked data • Free data from silos • Expose relationships • Powerful, seamless interlinking of our data • Users interact or query data in new ways • Search results would be more precise • Data can be easily repurposed
  • 108. Making the Case for Linked Data in Academic Library Digital Collections – Problem: Rich metadata is being lost in dumbed-down DC records – Issue: Investment and resource allocation (item-level philosophy) – Goal: Increased exposure, collaboration, and openness – Outcome: Increased discovery and user focus
  • 109. Gaining Buy-In. Administration: • Innovative project, high impact • Pilot, experiment, learn by doing, share results. Staff: • We already have the metadata; we need to transform it into triples • Managing change
  • 113. Implications (Internal) • Cross-unit collaboration is necessary • Staff expertise will evolve • Staff roles will change to accommodate new / parallel workflow • Data clean-up will be an investment • Management of data becomes critical • Discovery issues = user interfaces still need development
  • 114. Implications (External) • Publish data from our collections in the Linked Data Cloud to improve discoverability and connections with other related data sets on the Web • Sharing data in new ways with new partners may raise new issues • Need to engage with linked data community for technologies, tools, best practices, and to demand library vendor support for LOD.
  • 115. UNLV Linked Data Project Goals: • Study the feasibility of developing a common process that would allow the conversion of our collection records into linked data preserving their original expressivity and richness • Publish data from our collections in the Linked Data Cloud to improve discoverability and connections with other related data sets on the Web
  • 117. Actions: Prepare data, Export data, Import data, Clean data, Reconcile, Generate triples, Export RDF, Import data, Publish. Technologies: CONTENTdm, OpenRefine, Mulgara / Virtuoso
  • 118. Prepare / Export Data Technology: CONTENTdm • Increase consistency across collections: – metadata element labels – use of CV, share local CVs – etc. • Export data as spreadsheet • Create mapping between metadata elements and EDM model predicates
  • 119. OpenRefine • Open source • It is a server – can communicate with other datasets via HTTP • OpenRefine and its RDF extension should be installed. Screenshots show some of the functions we have used.
  • 122. Facets
  • 127. Facet view for Graphic Elements after splitting
  • 138. Actions: Prepare data, Export data, Import data, Clean data, Reconcile, Generate triples, Export RDF, Import data, Publish, Query. Technologies: CONTENTdm, OpenRefine, Mulgara / Virtuoso
  • 140. A simple SPARQL query: SELECT * WHERE { ?s ?p ?o } LIMIT 100
  • 142. SPARQL: Querying Data • Using Virtuoso PivotViewer
  • 148. Next steps for the UNLV project • Transform all digital collections into linked data (parallel structure) • Increase linkage with other datasets • Design interfaces to access and display our data and related data from other datasets • Evaluate alternative interfaces from the user’s perspective • Produce a cost-benefit analysis to inform future plans for the development of digital collections
  • 150. NISO Webinar: Library Linked Data: From Vision to Reality Questions? All questions will be posted with presenter answers on the NISO website following the webinar: http://www.niso.org/news/events/2013/webinars/linked_data NISO Webinar • December 11, 2013
  • 151. THANK YOU Thank you for joining us today. Please take a moment to fill out the brief online survey. We look forward to hearing from you!
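
The identity matching described on slide 88 (name, life dates, whitelisted source words) might be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Person` record, the `JAZZ_WORDS` whitelist, the example.org URI and the order in which the criteria are applied are all assumptions made for the example. The documented process lives in the linkedjazz-name-directory repository.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical whitelist; the slides do not list the words actually used.
JAZZ_WORDS = {"jazz", "bebop", "saxophonist", "trumpeter"}


@dataclass
class Person:
    """Assumed minimal record for one DBpedia or LC identity."""
    uri: str
    name: str
    birth: Optional[int] = None
    death: Optional[int] = None
    sources: str = ""  # free text, e.g. gathered from madsrdf:Source notes


def name_key(name: str) -> str:
    """Reduce 'Parker, Charlie, 1920-1955' and 'Charlie Parker' to the same key."""
    parts = re.findall(r"[^\W\d_]+", name.lower())
    return " ".join(sorted(parts))


def is_match(dbpedia: Person, lc: Person) -> bool:
    """Names must agree; then life dates decide, else whitelisted source words."""
    if name_key(dbpedia.name) != name_key(lc.name):
        return False
    if dbpedia.birth is not None and lc.birth is not None:
        if dbpedia.birth != lc.birth:
            return False
        both_deaths = dbpedia.death is not None and lc.death is not None
        return not both_deaths or dbpedia.death == lc.death
    # No comparable dates: fall back to jazz-related words in the LC sources.
    words = set(re.findall(r"[^\W\d_]+", lc.sources.lower()))
    return bool(words & JAZZ_WORDS)


bird = Person("http://dbpedia.org/resource/Charlie_Parker", "Charlie Parker", 1920, 1955)
# Placeholder URI, not a real LC authority identifier.
candidate = Person("http://example.org/lc/placeholder", "Parker, Charlie, 1920-1955", 1920, 1955)
print(is_match(bird, candidate))  # True
```

Restricting candidates to a narrow domain is what makes such simple rules workable: within jazz, a name plus matching life dates rarely collides, which fits the slide's remark that the smaller domain explains the extra matches.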

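Slide 91 notes that, because names in the directory are tied to URIs, finding a name in a transcript lets the project infer a rel:knowsOf triple between two individuals. A small sketch of that idea is given here under stated assumptions: the function names, the toy directory, and the choice of the interviewee as the triple's subject are hypothetical, and plain string matching stands in for the project's Transcript Analyzer. The predicate URI is the knowsOf term of the Relationship Vocabulary named on slide 97.

```python
import re

REL_KNOWS_OF = "http://purl.org/vocab/relationship/knowsOf"


def find_mentions(text, directory):
    """Return URIs whose directory name occurs in text (whole words, any case)."""
    found = []
    for name, uri in directory.items():
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE) and uri not in found:
            found.append(uri)
    return found


def knows_of_triples(interviewee_uri, transcript, directory):
    """Emit one N-Triples line per person mentioned by the interviewee."""
    return [
        f"<{interviewee_uri}> <{REL_KNOWS_OF}> <{uri}> ."
        for uri in find_mentions(transcript, directory)
        if uri != interviewee_uri
    ]


# Toy name directory (name variant -> URI); illustrative only.
directory = {
    "Charlie Parker": "http://dbpedia.org/resource/Charlie_Parker",
    "Dizzy Gillespie": "http://dbpedia.org/resource/Dizzy_Gillespie",
}
transcript = "I first heard Charlie Parker in Kansas City."
for line in knows_of_triples(
    "http://dbpedia.org/resource/Mary_Lou_Williams", transcript, directory
):
    print(line)
```

Several name variants can map to the same URI in the directory, so the sketch deduplicates on the URI rather than on the matched string.
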
Editor's notes

  1. Introduce self, NYPL role, Linked Jazz role. As a developer, a practical look at how we used linked data in our project and why.
  2. The guiding principle of the project is to develop practical and applicable ways to use linked data. It is not some far-off ideal; there are practical applications of it right now.
  3. The general layout of a project: research questions and primary documents.
  4. Data remediation still required.