NISO/DCMI Webinar:
Cooperative Authority Control:
The Virtual International Authority File
(VIAF)
December 4, 2013
Speaker:
Thomas Hickey, Chief Scientist, OCLC
http://www.niso.org/news/events/2013/dcmi/authority
Outline
 Background and Philosophy
 Visible VIAF
 Challenges
 New directions
 Relationship with other identifiers
 Coping with ambiguity
Why do we like authorities?
1. To enable a person to find a book of which either
(A) the author
(B) the title
(C) the subject
is known.
2. To show what the library has
(D) by a given author
(E) on a given subject
(F) in a given kind of literature
3. To assist in the choice of a book
(G) as to its edition (bibliographically)
(H) as to its character (literary or topical)
Charles A. Cutter: Rules for a printed dictionary catalog, 1876
What do authority files control?
• Names!
– Persons
– Corporations
– Places
– Uniform Titles
– Families
– Trademarks
– Concepts
But we also control
• Collective authors
• Pseudonyms
• Imaginary characters
• Deities, saints, angels
• Whales, horses, dinosaurs
• Buildings
• Ships, telescopes, space ships, missiles
• Kings, Popes, Presidents
• Cities, lakes, mountains
A changing world
• Libraries
– Local library
– Library consortia
– National cooperation
– Within languages
– Global
• Technology
– Handwritten
– Typed
– Printed
– Online
– Pervasive
EVERYBODY WANTS
TO CHANGE THE WORLD
BUT NOBODY WANTS
TO CHANGE
A world of linked data
http://www.w3.org/DesignIssues/diagrams/lod/2010-color.png
Challenges to libraries
• Reflect these links in our catalogs
– RDA
• Link to external resources
• Have non-library resources link to us
– Promote our links
• Be integrated in our users' workflow
Library data is
• Trusted
• Understood
• Reasonably interoperable
• Complex
Within the community, linked data of limited help
Shareable metadata
• Public
• Simple
• Supply data rather than APIs
– Avoid idiosyncratic protocols
• Z39.50
• MARC-21
• ISO2709
Brief history of VIAF
1998: VIAF proof-of-concept project launched
2003: VIAF Consortium formed (Berlin)
• Library of Congress
• Die Deutsche Bibliothek
• OCLC Research
2007: BnF joins
2011: After considering multiple options, consensus to transition VIAF to an OCLC service
2012: VIAF becomes an OCLC service; VIAF Council holds 1st meeting (Helsinki)
4 Principals + 18 Contributors in 18 countries
VIAF’s Goals
 Reduce cost of authority control
 Increase the utility of library authority files
 Provide links between equivalent names
 Make the information Web friendly
 Open API
 Bulk downloads
 Open Linked Data
Applications
 FRBR matching
 Better matching of non-English metadata
 Uniform identifier across all languages
 Authority control for cataloging
 Better regionalization of catalogs
 Minimize differences across languages of
cataloging
More intelligent linking and searching
VIAF authority record counts
Personal: 26,400,000
Corporate: 5,100,000
Geographic: 400,000
Uniform Titles: 1,800,000
Web interface and usage
VIAF Use
Usage
• Browser usage for past year
– 953,020 visitors
– 1,531,493
– 5,448,910 pages
• API usage
– Went from 90% of usage to 98%
– Peaks at ~20/second
– ~ 5 million searches/week
• Downloads
– ~150/week for links, 150 for clusters
Building VIAF
Enhancing authorities
[Diagram boxes: Bibliographic Record, Derived Authority, Authority Record, Processed Authority]
Record Flow
• 37 million authority records
• 30 million links between authorities
[Diagram: SWNL Bib & Authority, BnF Bib & Authority, LC Bib & Authority, VIAF]
Machine access to VIAF
Background
 VIAF is available in bulk downloads
 All online interaction with VIAF is RESTful
Using SRU
 http://www.loc.gov/standards/sru/
 http://www.oclc.org/developer/documentation/virtual-international-authority-file-viaf/using-api
Bulk downloads
 Go to http://viaf.org/viaf/data
 Variety of formats
 Just links
 RDF (XML and N-Triples)
 MARC-21
 Native XML clusters
SRU
 Search/Retrieve via URLs
 http://viaf.org/viaf/search?query=dempsey
 http://viaf.org/viaf/search?query=local.names+all+dempsey&sortKeys=holdingscount
 http://viaf.org/viaf/search?query=local.names+all+cervantes+and+local.sources+any+%22bnc+bne%22&sortKeys=holdingscount
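The search URLs on this slide share one pattern: a CQL expression in `query` and an optional `sortKeys`. A minimal Python sketch of assembling them follows. It assumes only the base URL and parameter names shown on the slide; the helper name `viaf_search_url` is hypothetical, and no request is sent.

```python
from urllib.parse import urlencode

VIAF_SEARCH = "http://viaf.org/viaf/search"  # base URL taken from the slide


def viaf_search_url(cql, sort_keys=None):
    """Build an SRU search URL from a CQL query (hypothetical helper)."""
    params = {"query": cql}
    if sort_keys is not None:
        params["sortKeys"] = sort_keys
    # urlencode turns spaces into '+' and double quotes into %22,
    # the same encoding the slide's example URLs use.
    return VIAF_SEARCH + "?" + urlencode(params)


print(viaf_search_url("local.names all dempsey", "holdingscount"))
print(viaf_search_url(
    'local.names all cervantes and local.sources any "bnc bne"',
    "holdingscount",
))
```

Writing the CQL as a plain string and leaving the escaping to `urlencode` avoids hand-encoding quotes and spaces.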
SRU Tricks
 RSS feed
http://viaf.org/viaf/search?query=dempsey&http:accept=application/rss%2bxml
 Exact with truncation
http://viaf.org/viaf/search?query=local.names+exact+%22cervantes*%22&sortKeys=holdingscount
http://viaf.org/viaf/search
URL Patterns
 http://viaf.org/viaf/95216565
 http://viaf.org/viaf/sourceID/BNF%7C11926133
 http://viaf.org/viaf/sourceID/LC%7Cn++79130807
 http://viaf.org/viaf/95216565/viaf.xml
 http://viaf.org/viaf/95216565/justlinks.json
 http://viaf.org/viaf/95216565/marc21.xml
 http://viaf.org/viaf/95216565/rdf.xml
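The patterns above can be generated rather than typed by hand. The sketch below assumes only the URL shapes listed on this slide; the function names and the short format keys (`"viaf"`, `"links"`, and so on) are hypothetical labels chosen for illustration.

```python
from urllib.parse import quote_plus

VIAF_BASE = "http://viaf.org/viaf"  # base URL taken from the slide

# Suffixes listed on the slide; None means the plain cluster URL.
FORMATS = {
    None: "",
    "viaf": "/viaf.xml",
    "links": "/justlinks.json",
    "marc21": "/marc21.xml",
    "rdf": "/rdf.xml",
}


def cluster_url(viaf_id, fmt=None):
    """URL for a VIAF cluster, optionally in one of the listed formats."""
    return f"{VIAF_BASE}/{viaf_id}{FORMATS[fmt]}"


def source_id_url(source, local_id):
    """URL that looks a cluster up by a contributing source's own ID."""
    # '|' becomes %7C and spaces become '+', mirroring the slide's examples.
    return f"{VIAF_BASE}/sourceID/{quote_plus(source + '|' + local_id)}"


print(cluster_url(95216565, "links"))
print(source_id_url("LC", "n  79130807"))
```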
New Directions for VIAF
 Non-library sources
 Information from WorldCat
 Integration with WorldCat
VIAFbot – The Wikipedia Connection
VIAFbot
http://www.flickr.com/photos/vintagehalloweencollector/480856825/
 OCLC Wikipedian in residence
Max Klein
 Automatic comparison of VIAF
and Wikipedia references
 Initially English then German
 Now working with WikiData
WikiData
VIAF↔Wikidata Linking Benefits
VIAF Enhancing Wikipedia language coverage
14,000+ new labels/aliases added
VIAF – in the Web of Bibliographic Data
[Diagram: linked-data graph]
Worldcat.org/oclc/81453459 (The Hidden Face of Eve)
– author: http://viaf.org/viaf/84254254/ (Nawal El Saadawi)
– about: http://id.loc.gov/authorities/subjects/sh85120576 (The Sex customs)
Nawal El Saadawi, linked by sameAs:
– http://viaf.org/viaf/84254254/
– http://www.wikidata.org/wiki/Q238514
– http://isni-url.oclc.nl/isni/0000000120296695
Other non-library sources
• ISNI
– International Standard Name Identifier
• Perseus Digital Library
• Syriac project names
• Fihirst Arabic names
Information from WorldCat
Multilingual Bibliographic
Structure Project
 Majority of WorldCat about non-English works
 Much of the metadata is non-English
 Hybrid records
 Parallel records
 FRBR work-level algorithm plus GLIMIR
manifestation/expression level
 Identify 3 levels of FRBR
 Can’t we do something with these?
Approach
• Process at work-level when possible
• Extract most reliable information
• Use that to extract less reliable
• Find
– Languages, original language
– Translators
– Titles (by language)
Benefits
• Localize metadata to various languages
– Easier cataloging
– Better cataloging
• Merge
• Fix
– Better displays to fit the user
• Linking of translations
• Appropriate language
• Use all appropriate data!
• Better FRBR groupings
Records for VIAF
• Translated works
– Work and expression records
– More information about
• Languages
• Translators
– Better links between work/expression records
Other possibilities
• Variant forms of names
• More titles
• Coauthors
• FAST subject headings
Identifier relationships
ISNI
International Standard Name Identifier
 Draft ISO standard:
… aspires to provide a means to uniquely identify creators, including
authors, composers, artists, cartographers and performers, among
others. Such an authoritative identifier will serve to provide a link for
occurrences of the identity across databases on the web
 Driven by rights-holders
 Publishers
 Rights agencies representing authors, artists
 Active disambiguation program
ORCID
 Started with Thomson Reuters' ResearcherID
 Most ‘social’
 Claiming IDs
 Interactive verification of associated works
 Pulling together several current initiatives
 Driven by STM, university communities
 Primarily interested in researchers
 Large number of participants
 Mostly concerned with present and future names
Cooperation Challenges
 What data can be shared?
 How to fund the efforts?
 Established by different types of institutions:
 Libraries, Standards Organization, STM Publishers
 Different
 Technologies
 Time scales
 What does the name represent?
 People, personas, organizations
 Who is in charge?
Commonalities
 All centered in not-for-profits
 All interested in data exchange
 All interested in global systems
 All have an understanding of the problem
 Personal author disambiguation and identification
 Central to their operations
Coping with Ambiguity
1,520 headings found for smith, john
The problem
 Two names in single source for same identity
 Mixed identities
 Different granularity
 Pseudonyms
 Presidents, Kings
 Chains of matches
 VIAF has ~ ½ million ambiguous groups
Goal
• 99+% sure of pair-wise assertions
– Includes all pairs of records in resulting clusters
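Because the goal covers every pair inside a cluster, the number of assertions grows quickly: a cluster of n records implies n(n-1)/2 pair-wise claims. A small sketch of that arithmetic, with made-up record IDs and a hypothetical helper name:

```python
from itertools import combinations


def pairwise_assertions(cluster):
    """List every pair of records that a cluster asserts to be the same identity."""
    return list(combinations(sorted(cluster), 2))


# Record IDs are invented for illustration.
cluster = {"LC|a", "BNF|b", "DNB|c", "SWNL|d"}
pairs = pairwise_assertions(cluster)
print(len(pairs))  # 4 records give 4 * 3 / 2 pairs
```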
Another common issue
Harvest and ingest
 Coping with
– Duplicate identifiers
– Deletes
Matching Authorities to Bibs
 Sometimes identifier
 Often ambiguity with just names
 Multiple possibilities
 May mix identities
Cross references within sources
 Strings can be ambiguous
 Links not necessarily resolvable
Enhance the authority records
• Pull information from bibs, authority notes
• Cope with
– Mistagged fields
– Ambiguous dates
– Errors in pulling titles, etc.
Pair-wise matching between sources
• Two dozen types of matches
– Ranked by reliability/strength
• Major problems
– Missing information
– Mixed identities
• Can override the matching
– xA
Duplicates within sources
• Rely primarily on
– String similarity
– Complexity of the preferred form
• Also look for multiple links from other sources
• Lonely names
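The slide does not say which string-similarity measure is used. As a stand-in, the sketch below scores two headings with the Python standard library's `difflib.SequenceMatcher` after a simple normalization; both the normalization rules and the helper names are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(heading):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", heading.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a score between 0 and 1 for two headings (stand-in measure)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


print(similarity("Smith, John", "smith john"))
print(similarity("Smith, John", "Cervantes, Miguel"))
```

A real duplicate check would combine such a score with the other signals on the slide, such as the complexity of the preferred form and links from other sources.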
Pulling together groups
• Only keep strongest links between records in
different sources
– A record in source A may match several records in
source B
– E.g. keep a double-date match over a coauthor match
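One way to read this step is as a filter over candidate links. The sketch below is an illustration under stated assumptions: the numeric strengths are invented (the slide only says a double-date match outranks a coauthor match), record IDs take a hypothetical `SOURCE|id` form, and a link is kept only when it is the strongest for both of its endpoints.

```python
# Hypothetical ranking; the slide names about two dozen match types.
STRENGTH = {"double_date": 2, "coauthor": 1}


def source_of(record_id):
    """Source prefix of an ID written as 'SOURCE|local-id' (assumed format)."""
    return record_id.split("|", 1)[0]


def strongest_links(links):
    """Keep each record's strongest link into every other source."""
    best = {}  # (record, other source) -> strongest link seen so far
    for link in links:
        a, b, kind = link
        for rec, other in ((a, b), (b, a)):
            key = (rec, source_of(other))
            if key not in best or STRENGTH[kind] > STRENGTH[best[key][2]]:
                best[key] = link
    # Assumption: a link survives only if it is the best for both ends.
    return [
        link for link in links
        if best[(link[0], source_of(link[1]))] == link
        and best[(link[1], source_of(link[0]))] == link
    ]


candidates = [
    ("A|1", "B|7", "coauthor"),
    ("A|1", "B|9", "double_date"),
]
print(strongest_links(candidates))
```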
Generate coherent clusters
• Look for cliques
• Merge subgraphs
o Strength of the best link between the pair
o Number of links between the pair
o A metric based on
 Strength of the match
 Title closeness
 Node type (corporate, personal, etc.)
 Name closeness
o Whether the nodes are personal names or not
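A minimal sketch of the clique test, assuming links are given as pairs of record IDs (the IDs below are invented). It groups records into connected subgraphs and reports whether each one is a clique, meaning every pair in it is directly linked. It does not implement the merge metrics listed above.

```python
from itertools import combinations


def connected_components(edges):
    """Group linked records into connected subgraphs; also return adjacency."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(adj[node] - comp)
        seen |= comp
        components.append(comp)
    return components, adj


def is_clique(nodes, adj):
    """True when every pair of nodes is directly linked."""
    return all(b in adj[a] for a, b in combinations(nodes, 2))


edges = [
    ("LC|1", "BNF|1"), ("BNF|1", "DNB|1"), ("LC|1", "DNB|1"),  # triangle
    ("LC|2", "BNF|2"), ("BNF|2", "DNB|2"),                     # chain
]
components, adj = connected_components(edges)
for comp in components:
    print(sorted(comp), is_clique(comp, adj))
```

The second group is a chain of matches: connected, but with no direct link between its two ends, so it is the kind of subgraph that needs a closer look before merging.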
Coherent clusters
• Avoid
 Date conflicts
 Incompatible names
 Names that are cross references to each other
 Names that differ only in a number
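Two of these checks are easy to illustrate. The sketch below assumes dates are (birth, death) year tuples with `None` for unknown values and treats any disagreement between known years as a conflict; both helpers and the sample headings are hypothetical, not VIAF's actual rules.

```python
import re


def mask_numbers(name):
    """Replace every run of digits with a placeholder."""
    return re.sub(r"\d+", "#", name)


def differ_only_in_number(a, b):
    """True when two names are identical except for their digits."""
    return a != b and mask_numbers(a) == mask_numbers(b)


def dates_conflict(a, b):
    """True when both records give a year for the same event and they differ."""
    return any(
        x is not None and y is not None and x != y
        for x, y in zip(a, b)
    )


print(differ_only_in_number("Infantry Regiment, 12th", "Infantry Regiment, 14th"))
print(dates_conflict((1547, 1616), (1547, None)))
print(dates_conflict((1547, 1616), (1550, 1616)))
```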
Assign VIAF IDs
 Minimize moves of source records
 Redirect unused VIAF IDs if possible
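One simple reading of "minimize moves" is a majority vote: give a rebuilt cluster the VIAF ID that most of its records already carried. The sketch below assumes a mapping from record ID to previous VIAF ID; all names and IDs are invented, and it ignores the case where two clusters compete for the same ID.

```python
from collections import Counter


def choose_viaf_id(cluster, previous_ids):
    """Pick the previous VIAF ID shared by the most records in the cluster."""
    votes = Counter(previous_ids[r] for r in cluster if r in previous_ids)
    if not votes:
        return None  # no history: the caller would mint a new ID
    return votes.most_common(1)[0][0]


previous = {"LC|1": "100", "BNF|1": "100", "DNB|1": "200"}
print(choose_viaf_id(["LC|1", "BNF|1", "DNB|1", "SWNL|1"], previous))
```

Here only the DNB record moves; the ID it leaves behind is a candidate for a redirect if no cluster still uses it.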
Create links between clusters
• Cross references
• Uniform titles
• Coauthors
• Other bibliographic titles
In general, link only if there is no ambiguity
Lonely names
©2013 OCLC. This work is licensed under a Creative Commons Attribution 3.0 Unported License. Suggested attribution: “This
work uses content from [presentation title] © OCLC, used under a Creative Commons Attribution license:
http://creativecommons.org/licenses/by/3.0/”
Thank You!
Questions?
All questions will be posted with presenter answers on
the NISO website following the webinar:
http://www.niso.org/news/events/2013/dcmi/authority
Thank you for joining us today.
Please take a moment to fill out the brief online survey.
We look forward to hearing from you!
THANK YOU

How to Fix XML SyntaxError in Odoo the 17
 
Mental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young mindsMental Health Awareness - a toolkit for supporting young minds
Mental Health Awareness - a toolkit for supporting young minds
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of EngineeringFaculty Profile prashantha K EEE dept Sri Sairam college of Engineering
Faculty Profile prashantha K EEE dept Sri Sairam college of Engineering
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptxBIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
BIOCHEMISTRY-CARBOHYDRATE METABOLISM CHAPTER 2.pptx
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
Mythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITWMythology Quiz-4th April 2024, Quiz Club NITW
Mythology Quiz-4th April 2024, Quiz Club NITW
 
Measures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped dataMeasures of Position DECILES for ungrouped data
Measures of Position DECILES for ungrouped data
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
Unraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptxUnraveling Hypertext_ Analyzing  Postmodern Elements in  Literature.pptx
Unraveling Hypertext_ Analyzing Postmodern Elements in Literature.pptx
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 

NISO/DCMI Webinar: Cooperative Authority Control: The Virtual International Authority File (VIAF)

  • 1. NISO/DCMI Webinar: Cooperative Authority Control: The Virtual International Authority File (VIAF) December 4, 2013 Speaker: Thomas Hickey, Chief Scientist, OCLC http://www.niso.org/news/events/2013/dcmi/authority
  • 2. Thomas Hickey Chief Scientist 2013 December 4 NISO/DCMI Webinar Cooperative Authority Control: Virtual International Authority File (VIAF)
  • 3. Outline • Background and Philosophy • Visible VIAF • Challenges • New directions • Relationship with other identifiers • Coping with ambiguity
  • 4. Why do we like authorities? 1. To enable a person to find a book of which either (A) the author (B) the title (C) the subject is known. 2. To show what the library has (D) by a given author (E) on a given subject (F) in a given kind of literature. 3. To assist in the choice of a book (G) as to its edition (bibliographically) (H) as to its character (literary or topical). Charles A. Cutter: Rules for a printed dictionary catalog, 1876
  • 5. What do authority files control? • Names! – Persons – Corporations – Places – Uniform Titles – Families – Trademarks – Concepts
  • 6. But we also control • Collective authors • Pseudonyms • Imaginary characters • Deities, saints, angels • Whales, horses, dinosaurs • Buildings • Ships, telescopes, space ships, missiles • Kings, Popes, Presidents • Cities, lakes, mountains
  • 7. A changing world • Libraries – Local library – Library consortia – National cooperation – Within languages – Global • Technology – Handwritten – Typed – Printed – Online – Pervasive EVERYBODY WANTS TO CHANGE THE WORLD BUT NOBODY WANTS TO CHANGE
  • 8. A world of linked data http://www.w3.org/DesignIssues/diagrams/lod/2010-color.png
  • 9.
  • 10. Challenges to libraries • Reflect these links in our catalogs – RDA • Link to external resources • Have non-library resources link to us – Promote our links • Be integrated in our users’ workflow
  • 11. Library data is • Trusted • Understood • Reasonably interoperable • Complex Within the community, linked data of limited help
  • 12. Shareable metadata • Public • Simple • Supply data rather than APIs – Avoid idiosyncratic protocols • Z39.50 • MARC-21 • ISO2709
  • 13. Brief history of VIAF • 1998: VIAF proof-of-concept project launched • 2003: VIAF Consortium formed (Berlin) by the Library of Congress, Die Deutsche Bibliothek, and OCLC Research • 2007: BnF joins • 2011: After considering multiple options, consensus to transition VIAF to an OCLC service • 2012: VIAF becomes an OCLC service; VIAF Council holds 1st meeting (Helsinki) • 4 Principals + 18 Contributors in 18 countries
  • 14. VIAF’s Goals • Reduce cost of authority control • Increase the utility of library authority files • Provide links between equivalent names • Make the information Web friendly • Open API • Bulk downloads • Open Linked Data
  • 15. Applications • FRBR matching • Better matching of non-English metadata • Uniform identifier across all languages • Authority control for cataloging • Better regionalization of catalogs • Minimize differences across languages of cataloging • More intelligent linking and searching
  • 16.
  • 17. VIAF authority record counts • Personal: 26,400,000 • Corporate: 5,100,000 • Geographic: 400,000 • Uniform Titles: 1,800,000
  • 18. Web interface and usage
  • 19.
  • 20.
  • 21.
  • 23. Usage • Browser usage for past year – 953,020 visitors – 1,531,493 – 5,448,910 pages • API usage – Went from 90% of usage to 98% – Peaks at ~20/second – ~5 million searches/week • Downloads – ~150/week for links, 150 for clusters
  • 24.
  • 25.
  • 28. Record Flow • 37 million authority records • 30 million links between authorities • Diagram: bibliographic and authority records from each source (SWNL, BnF, LC) flow into VIAF
  • 30. Background • VIAF is available in bulk downloads • All online interaction with VIAF is RESTful, using SRU • http://www.loc.gov/standards/sru/ • http://www.oclc.org/developer/documentation/virtual-international-authority-file-viaf/using-api
  • 31. Bulk downloads • Go to http://viaf.org/viaf/data • Variety of formats • Just links • RDF (XML and N-Triples) • MARC-21 • Native XML clusters
  • 32. SRU • Search/Retrieve via URLs • http://viaf.org/viaf/search?query=dempsey • http://viaf.org/viaf/search?query=local.names+all+dempsey&sortKeys=holdingscount • http://viaf.org/viaf/search?query=local.names+all+cervantes+and+local.sources+any+%22bnc+bne%22&sortKeys=holdingscount
  • 33. SRU Tricks • RSS feed: http://viaf.org/viaf/search?query=dempsey&http:accept=application/rss%2bxml • Exact with truncation: http://viaf.org/viaf/search?query=local.names+exact+%22cervantes*%22&sortKeys=holdingscount
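The search URLs on the two SRU slides are ordinary query strings, so they can be assembled rather than typed by hand. The following is a minimal sketch, assuming only the base URL and the `query` and `sortKeys` parameters shown on the slides; the function name `sru_search_url` is hypothetical and not part of any VIAF client library.

```python
from urllib.parse import urlencode

# Base search endpoint as shown on the SRU slides.
VIAF_SEARCH = "http://viaf.org/viaf/search"


def sru_search_url(cql_query, sort_keys=None):
    """Build a VIAF SRU search URL from a CQL query string.

    Hypothetical helper: spaces become '+', quotes become %22,
    and '*' is left unescaped so truncation reads as on the slide.
    """
    params = {"query": cql_query}
    if sort_keys:
        params["sortKeys"] = sort_keys
    return VIAF_SEARCH + "?" + urlencode(params, safe="*")


# Names index, sorted by holdings count
print(sru_search_url("local.names all dempsey", "holdingscount"))
# Exact match with truncation
print(sru_search_url('local.names exact "cervantes*"', "holdingscount"))
```

Building the URL from a readable CQL string keeps the percent-encoding in one place, which avoids the stray spaces that creep in when long URLs are wrapped across lines.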
  • 35. URL Patterns • http://viaf.org/viaf/95216565 • http://viaf.org/viaf/sourceID/BNF%7C11926133 • http://viaf.org/viaf/sourceID/LC%7Cn++79130807 • http://viaf.org/viaf/95216565/viaf.xml • http://viaf.org/viaf/95216565/justlinks.json • http://viaf.org/viaf/95216565/marc21.xml • http://viaf.org/viaf/95216565/rdf.xml
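The URL patterns on this slide fall into two families: a cluster URL with an optional representation suffix, and a `sourceID` lookup keyed by `SOURCE|local-id`. A small sketch of both follows; the helper names are hypothetical, and the two spaces in the LC identifier are an assumption inferred from the `++` in the slide's URL.

```python
from urllib.parse import quote_plus

VIAF_BASE = "http://viaf.org/viaf"


def cluster_url(viaf_id, representation=None):
    """URL for a VIAF cluster, optionally with a representation
    such as 'viaf.xml', 'justlinks.json', 'marc21.xml' or 'rdf.xml'."""
    url = f"{VIAF_BASE}/{viaf_id}"
    return f"{url}/{representation}" if representation else url


def source_id_url(source, local_id):
    """URL that looks up a cluster by a contributor's own identifier.

    The key is 'SOURCE|local_id'; '|' is encoded as %7C and
    spaces as '+', matching the pattern on the slide.
    """
    return f"{VIAF_BASE}/sourceID/{quote_plus(source + '|' + local_id)}"


print(cluster_url(95216565, "justlinks.json"))
print(source_id_url("BNF", "11926133"))
print(source_id_url("LC", "n  79130807"))  # two spaces assumed from 'n++79130807'
```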
  • 36. New Directions for VIAF • Non-library sources • Information from WorldCat • Integration with WorldCat
  • 37. VIAFbot – The Wikipedia Connection (image: http://www.flickr.com/photos/vintagehalloweencollector/480856825/) • OCLC Wikipedian in residence Max Klein • Automatic comparison of VIAF and Wikipedia references • Initially English then German • Now working with WikiData
  • 42. VIAF↔Wikidata Linking Benefits VIAF Enhancing Wikipedia language coverage 14,000+ New labels/aliases added
  • 43. VIAF – in the Web of Bibliographic Data • Diagram: Worldcat.org/oclc/81453459 (The Hidden Face of Eve) • author: http://viaf.org/viaf/84254254/ (Nawal El Saadawi) • sameAs: http://www.wikidata.org/wiki/Q238514 (Nawal El Saadawi) • sameAs: http://isni-url.oclc.nl/isni/0000000120296695 (Nawal El Saadawi) • about: http://id.loc.gov/authorities/subjects/sh85120576 (Sex customs)
  • 44. Other non-library sources • ISNI – International Standard Name Identifier • Perseus Digital Library • Syriac project names • Fihirst Arabic names
  • 46. Multilingual Bibliographic Structure Project • Majority of WorldCat about non-English works • Much of the metadata is non-English • Hybrid records • Parallel records • FRBR work-level algorithm plus GLIMIR manifestation/expression level • Identify 3 levels of FRBR • Can’t we do something with these?
  • 47. Approach • Process at work-level when possible • Extract most reliable information • Use that to extract less reliable • Find – Languages, original language – Translators – Titles (by language)
  • 48. Benefits • Localize metadata to various languages – Easier cataloging – Better cataloging • Merge • Fix – Better displays to fit the user • Linking of translations • Appropriate language • Use all appropriate data! • Better FRBR groupings
  • 49. Records for VIAF • Translated works – Work and expression records – More information about • Languages • Translators – Better links between work/expression records
  • 50. Other possibilities • Variant forms of names • More titles • Coauthors • FAST subject headings
  • 52. ISNI International Standard Name Identifier • Draft ISO standard: … aspires to provide a means to uniquely identify creators, including authors, composers, artists, cartographers and performers, among others. Such an authoritative identifier will serve to provide a link for occurrences of the identity across databases on the web • Driven by rights-holders • Publishers • Rights agencies representing authors, artists • Active disambiguation program
  • 53. ORCID • Started with Thomson Reuters’ ResearcherID • Most ‘social’ • Claiming IDs • Interactive verification of associated works • Pulling together several current initiatives • Driven by STM, university communities • Primarily interested in researchers • Large number of participants • Mostly concerned with present and future names
  • 54. Cooperation Challenges • What data can be shared? • How to fund the efforts? • Established by different types of institutions: • Libraries, Standards Organization, STM Publishers • Different • Technologies • Time scales • What does the name represent? • People, personas, organizations • Who is in charge?
  • 55. Commonalities • All centered in not-for-profits • All interested in data exchange • All interested in global systems • All have an understanding of the problem • Personal author disambiguation and identification • Central to their operations
  • 56. Coping with Ambiguity 1,520 headings found for smith, john
  • 57. The problem • Two names in single source for same identity • Mixed identities • Different granularity • Pseudonyms • Presidents, Kings • Chains of matches • VIAF has ~½ million ambiguous groups
  • 58. Goal • 99+% sure of pair-wise assertions – Includes all pairs of records in resulting clusters
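Because the goal covers every pair of records in a cluster, not just the pairs that were directly matched, the number of assertions that must hold grows quadratically with cluster size. A short illustration (the function name is hypothetical):

```python
from math import comb


def implied_pairs(cluster_size):
    """Number of pair-wise 'same identity' assertions a cluster implies."""
    return comb(cluster_size, 2)


for n in (2, 5, 20):
    print(n, "records ->", implied_pairs(n), "pairs")
```

A cluster of 20 source records therefore asserts 190 pair-wise equivalences, most of them between sources that never matched each other directly.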
  • 60. Harvest and ingest • Coping with – Duplicate identifiers – Deletes
  • 61. Matching Authorities to Bibs • Sometimes identifier • Often ambiguity with just names • Multiple possibilities • May mix an identity
  • 62. Cross references within sources • Strings can be ambiguous • Links not necessarily resolvable
  • 63. Enhance the authority records • Pull information from bibs, authority notes • Cope with – Mistagged fields – Ambiguous dates – Errors in pulling titles, etc.
  • 64. Pair-wise matching between sources • Two dozen types of matches – Ranked by reliability/strength • Major problems – Missing information – Mixed identities • Can override the matching – xA
  • 65. Duplicates within sources • Rely primarily on – String similarity – Complexity of the preferred form • Also look for multiple links from other sources • Lonely names
  • 66. Pulling together groups • Only keep strongest links between records in different sources – A record in source A may match several records in source B – E.g. keep a double-date match over a coauthor match
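The "keep only the strongest links" step can be sketched as a filter over candidate matches. This is an illustrative sketch, not VIAF's implementation: the record representation `(source, id)`, the numeric strengths, and the function name are all assumptions made for the example.

```python
from collections import defaultdict


def keep_strongest_links(links):
    """Keep, for each record and each other source, only its strongest links.

    links: list of (record_a, record_b, strength), where a record is a
    (source, local_id) tuple and a larger strength means a more reliable
    match type (hypothetical scale).
    """
    best = defaultdict(lambda: float("-inf"))
    for a, b, strength in links:
        # Best strength seen from record a into b's source, and vice versa.
        best[(a, b[0])] = max(best[(a, b[0])], strength)
        best[(b, a[0])] = max(best[(b, a[0])], strength)
    return [
        (a, b, s)
        for a, b, s in links
        if s == best[(a, b[0])] and s == best[(b, a[0])]
    ]


# A record in source A matches two records in source B:
# a double-date match (strength 3) and a coauthor match (strength 1).
candidates = [
    (("A", "1"), ("B", "1"), 3),
    (("A", "1"), ("B", "2"), 1),
]
print(keep_strongest_links(candidates))
```

Only the double-date match survives, mirroring the slide's example of preferring it over a coauthor match.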
  • 67. Generate coherent clusters • Look for cliques • Merge subgraphs – Strength of the best link between the pair – Number of links between the pair – A metric based on: strength of the match, title closeness, node type (corporate, personal, etc.), name closeness – Whether the nodes are personal names or not
  • 68. Coherent clusters • Avoid – Date conflicts – Incompatible names – Names that are cross references to each other – Names that differ only in a number
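One way to picture merging while avoiding conflicts is a greedy pass over links from strongest to weakest, refusing any merge that would put conflicting dates in one cluster. This is a simplified stand-in, not the clique-based procedure the slides describe: the record fields, the strengths, and the single "birth year" conflict test are assumptions for illustration only.

```python
def build_clusters(records, links):
    """Greedy cluster merging with a date-conflict check.

    records: dict of record_id -> {"birth": year or None}
    links:   list of (id_a, id_b, strength); larger strength is stronger
    """
    cluster_of = {rid: rid for rid in records}
    members = {rid: {rid} for rid in records}

    def known_years(cluster):
        return {records[r]["birth"] for r in members[cluster]} - {None}

    for a, b, _ in sorted(links, key=lambda link: -link[2]):
        ca, cb = cluster_of[a], cluster_of[b]
        if ca == cb:
            continue
        # Refuse the merge if it would mix two different known birth years.
        if len(known_years(ca) | known_years(cb)) > 1:
            continue
        for r in members[cb]:
            cluster_of[r] = ca
        members[ca] |= members.pop(cb)

    return sorted(sorted(group) for group in members.values())


records = {
    "lc1": {"birth": 1547},
    "bnf1": {"birth": 1547},
    "dnb1": {"birth": None},
    "bne9": {"birth": 1900},
}
links = [("lc1", "bnf1", 5), ("bnf1", "dnb1", 3), ("dnb1", "bne9", 2)]
print(build_clusters(records, links))
```

The undated record joins the 1547 cluster, and the weak link to the 1900 record is then rejected, which is how a chain of matches is kept from pulling two identities together.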
  • 69. Assign VIAF IDs • Minimize moves of source records • Redirect unused VIAF IDs if possible
  • 70. Create links between clusters • Cross references • Uniform titles • Coauthors • Other bibliographic titles In general, link only if there is no ambiguity
  • 72. ©2013 OCLC. This work is licensed under a Creative Commons Attribution 3.0 Unported License. Suggested attribution: “This work uses content from [presentation title] © OCLC, used under a Creative Commons Attribution license: http://creativecommons.org/licenses/by/3.0/” Thank You!
  • 73. NISO/DCMI Webinar Cooperative Authority Control: The Virtual International Authority File (VIAF) NISO/DCMI Webinar • December 4, 2013 Questions? All questions will be posted with presenter answers on the NISO website following the webinar: http://www.niso.org/news/events/2013/dcmi/authority
  • 74. Thank you for joining us today. Please take a moment to fill out the brief online survey. We look forward to hearing from you! THANK YOU

Editor's notes

  1. I’ve been involved in VIAF for more than 10 years, at first leading the implementation, then later the project, and more recently helping transition it to production. I need to thank Ted Fons, Executive Director in charge of many things, including VIAF, some of whose slides I’ve borrowed.
  2. And I’ve got a couple of other topics we could cover, depending on the time.
  3. Why? To make our catalogs work! To collocate things. These were put forth when librarians had little more than ink and paper, when card catalogs were a new technology, but the basic principles remain valid (possibly with some modification of what is meant by ‘the library’). Of course, the world is changing and different trends push catalogs in slightly different directions, but I think it will always be desirable to be able to tell what an author has written and to understand relationships between various expressions of a work.
  4. The only thing here out of scope for VIAF is ‘concepts’, although I don’t think we have any trademarks yet.
  5. Carolyn Keene, author of the Nancy Drew books, is an example of a collective author. VIAF is mainly people, corporations, jurisdictional geographics, works and expressions. With the RDA-related changes moving fictional characters into the names file in NACO, VIAF can expect to get many more of these classes.
  6. Change is hard! Parts of our collections have become more remote. When compared to Web pages, even our digitized material can seem hard to access. Catalogs have, if anything, become more important, even if the catalog the user encounters is part of Google. Remote access to many things, including our catalogs, has not just become instant, but available anywhere. VIAF is part of making libraries work at the global level.
  7. You’ve probably seen this. This was done some 3 years ago and would be much bigger now. Libraries are not a big part of it, but at least we are in it.
  8. VIAF supported linked data early, so it made the chart, but linked data is all about clear links and libraries have more to contribute to this than is usually recognized.
  9. At OCLC we call promoting our links ‘syndication’
  10. We do have some advantages: trust, etc. Within the community, ‘linked data’ is of limited help. We already have mechanisms, although they could be used to a greater extent. If we want to share metadata, however, our traditional approaches do not work well.
  11. It needs to be shareable between communities, which is a real challenge. Lately OCLC has been working with Schema.org
  12. Virtual International Authority File history: http://www.oclc.org/viaf/history.en.html
  13. Contributors: 34 agencies; 28 countries; 23 national libraries; +10 via consortia; 15 agencies joined in 2012-2013.
  14. In addition to ~34 million authorities, we also process ~107 million bibliographic records to find titles of resources, coauthors, publication dates, publishers, etc.
  15. 98% of VIAF access is via the API, but the Web browser/HTML presence is still important.
  16. Here is a typical late-morning view of VIAF through Google Analytics: ~50-60 users. We get quite a bit of use from South America even though we don’t have a source coming from there. Asian use is growing, but comes in at a different time of day.
  17. After doubling and tripling each year, we are seeing modest growth in browser usage. 5 million searches/week is about 8/second, 24 hours a day. The links file is ~350 Mbytes, the cluster files ~6 Gbytes.
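The "8/second" figure in note 17 is simply the weekly search count spread evenly over a week; the arithmetic can be checked directly:

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds
searches_per_week = 5_000_000        # figure quoted in the talk

average_rate = searches_per_week / SECONDS_PER_WEEK
print(f"{average_rate:.2f} searches/second on average")
```

The result is a little over 8 per second, consistent with the note and well below the peak of roughly 20 per second quoted on the Usage slide.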
  18. Some use from all over
  19. Nearly 2/3 from Europe (again this is browser use) Germany and France vie for most usage, although Switzerland might have highest per capita use
  20. Just a little bit about how the VIAF database is built
  21. 27
  22. T. Hickey
  23. As mentioned, well over 95% of VIAF’s access is non-browser: lots of ’bots harvesting, plus people pulling information into other applications.
  24. If you are going to look at a lot of records it can be faster to pull the whole file down. Updated monthly. Exactly what VIAF runs from.
  25. The last restricts the search to the BNE (National Library of Spain). We also have an abbreviation for all of Spain that includes the National Library of Catalonia.
  26. Shows all the indexes and how to browse. All URL-based REST services.
  27. Wikipedia: relatively few names (300K), but a big impact on the usefulness of VIAF. The first step was to harvest the English Wikipedia, find names, match them to VIAF and add Wikipedia links. Then Max (after much discussion with Wikipedians) pushed VIAF IDs into the English version (with reference to the German, which had many VIAF links). We then made Wikipedia a ‘source’ to VIAF so that its information can affect VIAF clustering. VIAF data was then pushed into WikiData, making VIAF links available to all language editions of Wikipedia. Eventually we expect to work directly with Wikidata.
  28. Just added Perseus to VIAF. Syriac is an old Middle Eastern language still in use. Fihirst is an Oxford-Cambridge project to control ancient Arabic names. VIAF was the ‘seed’ database to help get ISNI started. We continue to exchange data.
  29. VIAF has long pulled some of its bibliographic records out of WorldCat, but until now we have not ‘mined’ WorldCat to otherwise enrich VIAF
  30. Janifer Gatenby’s vision has become a cross-divisional project at OCLC
  31. Should be able to fix many missing/wrong language assignments http://www.slideshare.net/JaniferGatenby/multilingual-presentation-ifla-2013-0819
  32. Should greatly expand VIAF’s coverage of important works and expressions
  33. There is lots of information in WorldCat that would enhance VIAF displays PAUSE HERE FOR QUESTIONS? Next slide is Identifier Relationships
  34. VIAF is in a somewhat crowded field. There are at least two other major international identifier systems under development: ISNI and ORCID. OCLC is involved with both
  35. Wants to provide an identifier for all sorts of creators to form a link across databases on the web
  36. ORCID may offer the best chance for institutional repositories to quickly get an ID for use in their systems. You can expect to see ORCIDs on published material soon
  37. Historical reasons, including differing policies and procedures in different communities. Who creates the names? Who can update and maintain them? Who is in charge? Who sets priorities, resolves issues, sets standards? Systems. Organization, revenue. Not just between the projects themselves, but also challenges in cooperation within the projects. We have worked through some of these issues in VIAF. Person: Eleanor Marie Robertson. Personas: Nora Roberts, J.D. Robb, Jill March, Sarah Hardesty.
  38. But all three of the projects have a striking number of important things in common and I’m optimistic about cooperation. ANOTHER PAUSE? Next slide starts Coping with Ambiguity.
  39. Even if we look for clusters with the exact name ‘smith, john’ we find 67 of them, and 824 for ‘smith, john*’
  40. Even though the NL of Australia and Royal L of the Netherlands are not directly connected, VIAF is claiming a link there and those need to be 99+ % correct
  41. Incompatible names, e.g. Name and Name’s spirit Names with number: John V vs. John VI
  42. Always
  43. Just starting to look at these ISNI has concept of unique names Sometimes these get more complicated with multiple records from one or more sources NEXT SLIDE IS THE END!