This document provides an overview of the International Standard Name Identifier (ISNI) database and system. It discusses the sources that contribute data to ISNI, key statistics about assigned ISNIs and links, how to search the database through various methods, quality control processes, linkages to the Virtual International Authority File (VIAF) system, and several organizational uses of ISNI identifiers.
ISNI Database and Interoperability
1. Cross-domain: bridging domains
Libraries
Text Rights
Trade Sources Music Rights
Encyclopaedias
Researchers & Professional
Granting organisations
Professional Societies
Article databases
Theses databases
Archives and Museums
Harvard University Library 2014-11-18
2. ISNI: Where are we now?
• ISNI database and system
• Sources and figures
• Enquiry, API, Linked data
• end user input,
• notification system,
• VIAF database and system
• VIAF interoperability,
• ISNI Quality Infrastructure
• Assignment Agency
• QT interoperability,
• Some usages
• La Trobe,
• ICLA,
• Iconoclaste
• ORCID technical
cooperation.
• OCLC Research Task
Forces Support
ISNI at Harvard
18th November 2014
Janifer Gatenby
OCLC EMEA
3. Status at November 2014
• 8.69 million assigned ISNIs (was 1 million 2 years ago)
• 21.38 million links; ISNI persistent URI
Domain             Databases  Assigned      Links
Research           12         2,553,949     6,464,362
Text rights        8          130,771       723,625
Music              5          316,187       463,288
Libraries & trade  3          7.45 million  13.6 million
Organisations      3          446,258       109,211
ISNI database and system
VIAF interoperability
Quality
Usages
4. Current ISNI Sources …and growing
GENERAL SOURCES
Bowker Books in Print BOWKER
The European Library (48 national libraries) TEL
Virtual International Authority File (33 libraries) VIAF
RIGHTS MANAGEMENT
Access Copyright, Canada ACCE
Authors’ Guild AGLD
Authors’ Licensing and Collecting Society, UK ALCS
Centrum Dienstverlening Auteurs- en aanverwante Rechten, Netherlands CEDA
Centro Español de Derechos Reprográficos CEDR
Irish Copyright Licensing Agency ICLA
Prolitteris, Switzerland PROL
VG WORT, Germany VGWO
MUSIC
American Musicological Society AMS
British Library Sound Archive BLSA
International Performers’ Database Association IPDA
MusicBrainz MUBZ
RESEARCHERS AND PROFESSIONALS
American Musicological Society AMS
British Library Theses BRTH
Digital Author identifier, Netherlands DAI
Jisc Names Project, UK JNAM
La Trobe University AU:VLU
Modern Languages Association MLA
OCLC Theses OCLCT
ORCID and DataCite Interoperability Network ODIN
AuthorClaim and RePec OPENL
Proquest Theses PROQ
Scholar Universe, Proquest SCHU
Electronic tables of content ZETO
ORGANISATIONS
Boekenbank, Belgium BOEK
Bowker Publishers BOWP
Publishers Licensing Society, UK PLS
Ringgold RING
5. Assigned
8.69 million
Provisional: Possible
700,815
Provisional: Unassigned
9,287,278
Assigned ISNIs November 2014
VIAF + non VIAF sources                   4,870,099
3+ VIAF sources                           428,988
2+ sources (not VIAF)                     315,915
Unique name                               2,735,449
Trusted single source (JISC, BOEK, RING)  342,231
Total                                     8,692,683
Authoritative,
Unique,
Trustful,
Persistent
8.69 million persons
446,258 organisations
+ % confidence
- % confidence
6. Assigned
8.69 million
Provisional: Possible
700,815
Provisional: Unassigned
9,287,278
Authoritative,
Unique,
Trustful,
Persistent
Public
View assigned
SRU access
Persistent URL
End user input
Member / RAG
View all
SRU API
Persistent URL
Maintenance
Merge
Enrichment
Online assignment API
7. Searching indexes
Examples
Your code and update date later than December 2013:
Cn: ams & upd: > 201312
Your code and another’s code:
Cn: jnam & cn: proq
Name keyword, not your code:
Nw: trobe not cn: auvlu
Almost anything can be indexed
Also available by SRU API
See document ISNI search guidelines.doc
http://www.isni.org/content/documents-related-database-enquiry
8. Browse
9. Search by SRU API
See Document:
ISNI SRU search API guidelines.doc
Example search by name keyword (pica.nw):
http://isni.oclc.nl/sru/?query=pica.nw+%3D+%22maloy%2Brebecca%22&operation=searchRetrieve&recordSchema=isni-b
This search is for any records containing both “Rebecca” and “Maloy” in the name.
The response is in the XML enquiry response schema: ISNI enquiry response v2.xsd
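The example URL above can be assembled programmatically. The following is a minimal sketch, not an official client: the endpoint, the pica.nw index, and the isni-b record schema come from the slide, while the function name and its keyword-joining behaviour are illustrative assumptions. It only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Endpoint as shown on the slide; it may have changed since the talk.
ISNI_SRU_BASE = "http://isni.oclc.nl/sru/"


def build_name_keyword_url(*keywords: str, record_schema: str = "isni-b") -> str:
    """Build an SRU searchRetrieve URL for a name-keyword (pica.nw) search.

    Hypothetical helper: keywords are lower-cased and joined with a literal
    "+" inside the quoted search term, mirroring the slide's example.
    """
    term = "+".join(k.lower() for k in keywords)
    params = {
        "query": f'pica.nw = "{term}"',
        "operation": "searchRetrieve",
        "recordSchema": record_schema,
    }
    # urlencode percent-encodes '=', '"' and '+', and turns spaces into '+'.
    return f"{ISNI_SRU_BASE}?{urlencode(params)}"


url = build_name_keyword_url("Maloy", "Rebecca")
print(url)
```

Passing the resulting string to any HTTP client should return the XML enquiry response described on the slide, assuming the service is still reachable at that address.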
10. Reports and Notifications
• Bulk reports
• Basic
• Enriched
• Notifications
• Ad hoc reports
• Report generator
• WinIBW download
• Statistics
See document ISNI Data contributors reports and notifications guidelines.doc
http://www.isni.org/content/documents-related-data-submission-output
11. Enriched Bulk report
12. Notifications (PUSH after major update)
Someone else has matched & details
You probably need to take action
13. Statistics
Basic statistics
Cross matches
VIAF matches
14. VIAF Scope
• Persons
• Organisations
• Works / uniform titles
• Expressions
• Meetings
• Geographic
• All public data
ISNI Scope
• Persons
• + musicians, researchers
• Organisations
• (excluding sparse)
• (excluding undifferentiated)
• Includes private data
VIAF and ISNI are Complementary
15. VIAF Scope
16. So far we have added 2 million XR name title expression records to VIAF. These have been data mined from WorldCat. Note that the expression records indicate the translator.
http://www.slideshare.net/JaniferGatenby/multilingualism-ifla-2014-08
17. VIAF Role
• Ingest authority records from the world’s major national and research libraries
• Make clusters
• Expose and diffuse
ISNI Role
• Create permanent IDs
• By batch
• On demand
• Diffuse those IDs
• Libraries, trade, rights management, professional societies, educational institutions
VIAF and ISNI are Complementary
18. VIAF Traffic 1.47 million sessions p.a.
19. VIAF and ISNI are Complementary
VIAF System
• Harvester
• Clustering mechanism (re-clustered monthly)
• 5 web interface languages
• Download in multiple formats
• Linked data & SRU
1 million personal visitors p.a.
ISNI System
• Batch load
• Online request API
• Web site (English only)
• Allows end user input
• Member input and correction
• 16+ indexes
• SRU; linked data
• Quality Team monitoring & correcting
• Diffusion, including corrections
20. Sustaining Differentiation
1. Thomas, Russell Film director http://www.imdb.com/name/nm1306805/ Works:
Coldplay: live 2003, Really bend it like Beckham (2004), Bill Bailey Tinselworm
2. Thomas, Russell Brown, 1900- writing on education
3. Thomas, Russell Film director http://history.cfac.byu.edu/index.php/Thomas_Russell
Professor of media arts at Brigham Young University. Films include: Redemption,
Bonjour Danny Bonjour, Snell Show, Mr. Dungbeetle
4. Thomas, Russell Tenor from Miami http://www.russell-thomas.com/bio.asp
5. Thomas, Russell – professor of Music and Director of Jazz education at Jackson State
University http://www.jsums.edu/music/faculty/dr-russell-thomas/
6. Thomas, Russell J. 1966 writing on organic chemistry
7. Thomas, Russell N. 1973
8. Thomas, Russell Linwood, 1935 changed name to Al-Hajj Sayyd Abdul Al-Khabyyr –
saxophonist
9. Thomas, Russell A. Writing on retail
21. Synchronisation ISNI to VIAF
2012
• ISNI / VIAF identifiers
2013
• Full records; ISNI a VIAF source
2014
• ISNI records, verification mark
22. Scalable Quality Ecosystem
ISNI Database
Harvested, Batch loaded; Online contributions
Algorithms
Notifications
Data fixing
Sampling
Data Policy
Enrichment
Correction
Curation
Crowd sourcing
Data contributors
23. End User Note
It seems 2 ISNIs has been assigned to the French
singer Laïka Fatien (born 1968 in Paris): ISNI 0000
0000 8065 8419 and ISNI 0000 0000 7238 637X. I
think the last one can be deleted.
24. End User Note
Dear Sir / Madam, The ISNI 0000000117488848 refers to "Marco Antonio
Casanova", Professor at the Catholic University of Rio de Janeiro. I am not
the author of "Fragmentos póstumos. - Nietzsche uma introdução
filosófica" or "Segunda consideração intempestiva da utilidade e
desvantagem da história para a vida". The author of these works is "Marco
Antonio dos Santos Casa Nova". You may confirm this information by
consulting our CVs at the Brazilian Research Council: Marco Antonio
Casanova
(me): http://lattes.cnpq.br/0400232298849115 Marco Antonio dos Santos
Casa Nova
(the other author): http://lattes.cnpq.br/3409704326617178
25. End User Input – enriches, merges, splits
c. 10 to 50 a week and growing
Nightly email to QT
QT being enlarged with more national libraries
Goal – response within 3 days
Maintains clusters at cluster level
All concerned sources are notified
26. La Trobe University Links: 3,427
Linked Data: isni.org/isni/
27. La Trobe University: 1,864 VIAF Links
Linked Data: isni.org/isni/
28. Iconoclaste, Québec
29. Irish Copyright Licensing Agency
2,423 records entered online
30. ORCID Interoperability
            ISNI researchers   ORCID
Assigned    2,553,949          1 million
Links       6,464,362
ISNI: Public identity; Linked data, works associations; Curated crowd sourcing
ORCID: Self registration; Submitting articles for publication; Grant submissions
• ORCID IDs are ISNIs
• Identical structure and check digit calculation
• Reserved range supplied to ORCID and cannot
be applied by the ISNI system
• ORCID using SRU API into ISNI
• Reciprocal technical representation
• La Trobe wants to populate ORCID from ISNI
• Which is the preferred identifier? It is simple
technically to:
• Store both, allow enquiry on both, resolve both &
• Display and diffuse one as the preferred identifier
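Because ORCID IDs and ISNIs share one structure and one check digit calculation, a single validator can serve both. The sketch below assumes the ISO 7064 MOD 11-2 scheme, which the slides do not spell out but which ORCID documents for its own identifiers; the function names are illustrative. The sample values are the ISNIs quoted in the end user notes on slides 23 and 24.

```python
def isni_check_character(first15: str) -> str:
    """Compute the final character for the first 15 digits (MOD 11-2, assumed)."""
    total = 0
    for ch in first15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A computed value of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_isni(identifier: str) -> bool:
    """Check a 16-character ISNI or ORCID iD, ignoring spaces and hyphens."""
    compact = identifier.replace(" ", "").replace("-", "").upper()
    body, check = compact[:15], compact[15:]
    if len(compact) != 16 or not (body.isascii() and body.isdigit()):
        return False
    return isni_check_character(body) == check


for sample in ("0000 0000 8065 8419", "0000 0000 7238 637X", "0000000117488848"):
    print(sample, is_valid_isni(sample))
```

A shared validator like this is one reason it is technically simple to store and resolve both identifiers side by side, whichever is displayed as preferred.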
31. OCLC Research Task Forces Support ISNI
• http://oclc.org/research/publications/library/2014/oclcresearch-registering-researchers-2014-overview.html
Innovative clustering
Organisations in ISNI Task Force (OCLC Research Partners)
Looking at quality, completeness, diffusion and engagement
Editor's notes
Multiple domains
ISNI makes available to the public 8.69 million identities. Among these identities it has created 21.38 million links. These links do not count the links between VIAF sub sources; i.e. they are really valuable links because they are likely to lead to different information
For batch loading, ISNI assignment is not guaranteed.
Criteria for assignment: 2 or more independent sources or 3 VIAF sources, or the name is unique. Unique name assignment requires the forenames to be complete (i.e. not initials), and the metadata to not be sparse. Some sources whose data is trusted to be fully differentiated and deduplicated are used as base data and single source assignment.
For online assignment applications ISNI assignment is assured providing that the data passes the sparseness test and an assertion is made that the database has already been searched.
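The batch assignment criteria above can be read as a simple decision rule. The following is a rough sketch of that reading, not the ISNI system's actual logic: the Candidate fields and the function name are hypothetical, the trusted source codes are taken from slide 5 (with JISC mapped to the JNAM code), and treating VIAF as a single independent source is an assumption based on the breakdown on that slide.

```python
from dataclasses import dataclass, field
from typing import Set

# Sources the slides describe as trusted for single-source assignment.
TRUSTED_SINGLE_SOURCES = {"JNAM", "BOEK", "RING"}


@dataclass
class Candidate:
    """Hypothetical summary of one clustered identity after batch load."""
    viaf_sources: int = 0                      # number of VIAF sub-sources
    other_sources: Set[str] = field(default_factory=set)  # non-VIAF source codes
    name_is_unique: bool = False
    forenames_complete: bool = False           # full forenames, not initials
    sparse: bool = False                       # metadata too thin to trust


def qualifies_for_assignment(c: Candidate) -> bool:
    # Assumption: VIAF as a whole counts as one independent source.
    independent = len(c.other_sources) + (1 if c.viaf_sources > 0 else 0)
    if independent >= 2:
        return True
    if c.viaf_sources >= 3:
        return True
    if c.name_is_unique and c.forenames_complete and not c.sparse:
        return True
    if c.other_sources & TRUSTED_SINGLE_SOURCES:
        return True
    return False  # stays provisional


print(qualifies_for_assignment(Candidate(viaf_sources=1, other_sources={"PROQ"})))
```

Records that fail every rule would remain provisional, which matches the note that assignment is not guaranteed for batch loads.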
The public only see the blue section. Members are able to see the entire database and most detail in the record (except private data)
The blue box indicates the location of documentation on enquiry of the database. The indexes available in the members' view are extensive, permitting multiple views of the data. Examples are in the pink box, e.g. combining a source code with an update date, or with another source.
The system also includes a browse capability on most indexes. On the left is the name index and on the right a more unusual index, which finds records by number of sources.
Machine to machine enquiry. This is available in both private and public mode
Reports are sent after each major update and also available on demand. Bulk reports are on request, notifications are pushed from the system.
When events occur on records in the ISNI database, all sources concerned are notified. The notification is in the form of a regular monthly XML report. Notification fields for matches (tells you someone else has matched your data)
Recipient source code (028C $2)
Source of incoming record
Date/time of match
Matching data string
Matching data type (name and dates, name and title, partial name, date, title)
Matching score
Total evaluation score
Date/time stamp of notification
Notification fields for errors (you need to take action)
Type of error: merge, duplicate, dataError or split
Recipient source code
Recipient local identifier
Date/time stamp of field creation
Data field contents
Data field identifier, (e.g 021A = title)
Date/time stamp of notification
Should be
Correct ISNI
ExplanatoryText
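A data contributor receiving these reports might model the two notification kinds as follows. This is only a sketch of the field lists above: the real reports are XML and their element names are not given here, so every attribute name, the string timestamps, and the numeric score types are assumptions, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import Optional, Union

# Error types as listed in the notes.
ERROR_TYPES = {"merge", "duplicate", "dataError", "split"}


@dataclass
class MatchNotification:
    """Someone else has matched your data (informational)."""
    recipient_source_code: str
    incoming_source: str
    matched_at: str
    matching_data: str
    matching_data_type: str
    matching_score: float
    total_evaluation_score: float
    notified_at: str


@dataclass
class ErrorNotification:
    """You probably need to take action."""
    error_type: str
    recipient_source_code: str
    recipient_local_id: str
    field_created_at: str
    field_contents: str
    field_identifier: str
    notified_at: str
    should_be: Optional[str] = None
    correct_isni: Optional[str] = None
    explanatory_text: Optional[str] = None

    def __post_init__(self) -> None:
        if self.error_type not in ERROR_TYPES:
            raise ValueError(f"unknown error type: {self.error_type}")


def needs_action(n: Union[MatchNotification, ErrorNotification]) -> bool:
    # Only error notifications require the contributor to change local data.
    return isinstance(n, ErrorNotification)


# Made-up sample values, for illustration only.
sample = ErrorNotification("split", "AMS", "local-1", "2014-11-01", "Some title",
                           "021A", "2014-11-18")
print(needs_action(sample))
```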
ISNI’s scope overlaps but is not identical to VIAF’s scope. For persons, ISNI includes all VIAF (except sparse and undifferentiated records) plus includes many persons involved with music and research not present in VIAF.
Also, unlike VIAF, ISNI includes private data that may be used for matching but not displayed or diffused publicly. Such data includes dates of birth (actors in particular do not like their dates of birth publicized because it limits the parts that they are offered). Rights management associations are also not permitted to reveal the relationships between real persons and pseudonyms. Witness the recent case of JK Rowling publishing crime novels under a pseudonym and being irked that her cover was revealed by her lawyers.
This is from the last VIAF annual report. Most of the records in VIAF are for persons. Note that VIAF has 35 million persons but ISNI has fewer, because it eliminates sparse and undifferentiated records.
The VIAF annual report does not include the 2 million records that have been generated by mining WorldCat for translations. This is my other major project at OCLC and more information can be found on slide share.
ISNI’s role is different from VIAF’s. ISNI creates a permanent ID and is required to keep the ID as stable as possible, and where it changes must diffuse corrections. ISNI diffuses cross domain – libraries, trade, rights management, professional societies, education.
This slide shows the traffic on the VIAF file, the sister file of ISNI. This is a page from the annual report. This shows 1.47 million sessions in a year. ISNI’s figures do not compare; we have recently reached 14,000 visits a day.
ISNI includes an online request and maintenance capability
Improved data quality and confidence
Anomaly reports – 7,000 date anomalies (>50% represent real errors)
Merge, split and data error reports (c. 5,000)
Matching improvements
Dates, common surnames, longest name form, weightings, new elements
Detection of UNIMARC Conversion errors
parallel main names, name variant conversion, related names conversion, missed data
Pseudonyms
Feedback, record links (c. 70,000)
More widely diffused linked data
Proposal for inter-operation – joint notification, shared maintenance
This slide indicates how difficult it can be to sustain differentiation with a purely harvesting model. In this example, there are 9 identities with the name Thomas, Russell (and even more Russell, Thomas). They are easy to confuse: 2 film directors, 3 musicians. Once ISNI fixes these records by splitting, it puts a VIAF protect flag on the records to prevent further automatic update, and it also notifies VIAF via a verification flag that signals to VIAF to treat these records as XA police records.
In 2012, ISNIs were sent to VIAF. In 2013, the decision was taken to include ISNI as a source in VIAF, so ISNI started sending full records to VIAF for all assigned records that contained a VIAF code, including all restricted data from non-VIAF sources. In 2014, as well as records containing a VIAF code, records containing an ISNI code are now being sent to VIAF, including those created by the pseudonym programs or created manually by the ISNI Quality Team. The VIAF records that have been edited manually by the ISNI Quality Team contain a verification mark so that it can be used in the VIAF clustering process.
Living online database. ISNI’s system has a focus on quality. The data contributors load data and are responsible for the quality of their own data. OCLC as the assignment agency
Web interface for error reporting, enriching, detecting duplicates for data contributors
Web interface for public
Client for full maintenance including streamlined procedures* for Quality Team
Notifications to data contributors
Data Sampling*
Data Anomaly checks (dates, pseudonyms)*
Fixes to incoming data (pre and post load)*
Data enrichment to increase matching (Dewey)*
This is a typical input from an end user of the ISNI database. The requests are coming in at an average of 2-3 a day. The requests are almost all of very high quality, as above, and most (to our surprise) include an email address so that we can respond with the action taken. ISNI also undertakes to notify all sources when an error is fixed.
End user input is giving recommendations for merging, splitting and enriching.
La Trobe University is our pilot site for loading data from an institution registry. By loading, they found that half their data matched with the 12 ISNI sources for researchers. Thus they only have to look at the other half. The first load created 437 links to VIAF cluster records but 1,864 links to VIAF sources. They are aiming to achieve 100% assignment by the end of the year by using the web interface to find records to manually merge or enriching the data to achieve online assignment.
Iconoclaste in Québec has created software that makes a « Birth certificate » for music. During the process it accesses the ISNI database using SRU enquiry and retrieves name and ISNI. This can be accepted or refused. If nothing is found or the retrieved data is refused, the software uses the AtomPub API to create an ISNI assignment request and thus receives an ISNI.
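The search, confirm, then request flow described for Iconoclaste can be outlined as below. This is a hedged sketch of the control flow only: the three callables stand in for the SRU enquiry, the user's accept or refuse decision, and the AtomPub assignment request, and none of their names or signatures come from the ISNI documentation. The stubs use a name and ISNI quoted on slide 23 plus a placeholder string.

```python
from typing import Callable, Optional, Tuple

Hit = Tuple[str, str]  # (name, ISNI) as retrieved by the enquiry


def resolve_isni(
    name: str,
    search: Callable[[str], Optional[Hit]],      # stand-in for the SRU enquiry
    confirm: Callable[[Hit], bool],              # user accepts or refuses the hit
    request_assignment: Callable[[str], str],    # stand-in for the AtomPub request
) -> str:
    """Return an existing ISNI if one is found and accepted, else request one."""
    hit = search(name)
    if hit is not None and confirm(hit):
        return hit[1]
    # Nothing found, or the retrieved data was refused.
    return request_assignment(name)


# Stubbed usage; "REQUESTED" is a placeholder, not a real identifier.
known = {"Laïka Fatien": ("Laïka Fatien", "0000 0000 8065 8419")}
found = resolve_isni("Laïka Fatien", known.get, lambda hit: True, lambda n: "REQUESTED")
missing = resolve_isni("Unknown Artist", known.get, lambda hit: True, lambda n: "REQUESTED")
print(found, missing)
```

Keeping the network calls behind injected functions lets the accept or refuse step be tested without touching the live database.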
In the “real” Dublin, the ICLA has chosen to enter all their data using the web interface. It is impressive that they have created 2,423 records using this interface.
Total researchers in ISNI 3.55 million; 990,000 provisional records in addition to those above
ISNI and ORCID play complementary roles. ISNI enables a researcher to find his public identity and to correct it via the “yellow box”. ISNI creates links among its sources who are able to re-distribute to sub sources. ORCID is self registering and self diffusing.
OCLC Research is supporting ISNI in multiple projects:
OCLC Research partners task force on organisations in ISNI.
Similarity vector clustering techniques being looked into by 2 OCLC researchers in the Leiden office
OCLC Research task force on representing researchers in authority files
Linked data surveys, including ISNI.