A presentation by Susanne Thorborg, Bibliographic Consultant at the Danish Bibliographic Centre (DBC).
Delivered at the Cataloguing and Indexing Group in Scotland (CIGS) Linked Open Data (LOD) Conference, held on Friday 21 September 2012 at the Edinburgh Centre for Carbon Innovation.
1. The Danish National
Bibliography as LOD
Presentation for
CATALOGUING & INDEXING GROUP IN
SCOTLAND
2nd Linked Open Data Conference
21st September 2012
Susanne Thorborg, Bibliographic consultant,
DBC - Danish Bibliographic Centre
2. • DBC is a public limited company owned by
Local Government Denmark and the state
• DBC produces the main parts of the Danish
National Bibliography (books, articles, audio,
recorded music, images and movies) – the rest
is produced by the Royal Library.
• DBC’s part of the national bibliography is
produced according to a contract between the
Ministry of Culture and DBC
3. DBC also …
• operates and develops DanBib – the Danish
Union Catalogue and superstructure system for
the entire Danish library service
• operates and develops bibliotek.dk - the
citizens’ OPAC to the Union Catalogue
4. Why does DBC take an interest in LOD?
Because
bibliographic development is a strategic goal for DBC
LOD holds great potential for the library community
5. DBC's LOD project
The project's aim
Learning more about Linked Open Data
– in particular in the libraries
Getting practical experience with publishing
bibliographic data as LOD
6. The approach
Learning by doing
Complete application, but on a small scale
Technical platform and tools: "whatever means
available"
Reuse and test others' experiences – not reinventing
the wheel
Agile process (pragmatism rather than "100%")
7. The preliminaries – some questions
Which licence?
How to go from MARC to RDF? (the data model)
Reuse of relevant vocabularies and element sets – which to
choose?
Serialization – what are we going to choose?
Converting – how, and with which tools?
Dereferenceable URIs – how?
Bridging and mapping – how do we find the "good" links?
How to publish our own data?
Which technical platform?
8. What we have done – the challenges
The data
The data model
Converting data from danMARC2 to RDF
Making outgoing links
Dereferenceable URIs
Publishing
– Choice of licence
– Basic RDF/XML on the DBC site
– Load in triplestore (Mulgara)
– Set up SPARQL interface
9. The data
Bibliographic records:
– National bibliography, books 2010-
(approx. 47,000 records)
Authority records (names of persons):
– Corresponding authority records
(approx. 27,000 records) NB! Test records!
10. The data model
Design a simple RDF data model for "books"
Choose suitable element sets
- and vocabularies
We looked at the British Library (BL) – but we could also have used
the Open Metadata Registry http://metadataregistry.org/
12. Converting data from danMARC2 to RDF
Mapping danMARC2 to RDF
Converting tools
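The slides do not show the mapping or the tools that were used. As a rough illustration of what a field-to-property mapping can look like, the sketch below turns a simplified MARC-style record into N-Triples. Everything specific in it is an assumption made for the example: the field tags and subfield codes, the choice of Dublin Core terms, the base URI http://example.org/resource/ and the record layout are not taken from the DBC project.

```python
# Minimal sketch: map a simplified MARC-style record to N-Triples.
# Field tags, properties and the base URI are illustrative assumptions,
# not the mapping used in the DBC project.

DCT = "http://purl.org/dc/terms/"
BASE = "http://example.org/resource/"  # hypothetical base URI

# Hypothetical mapping: (field tag, subfield code) -> RDF property
MAPPING = {
    ("245", "a"): DCT + "title",
    ("100", "a"): DCT + "creator",
    ("260", "c"): DCT + "issued",
}


def escape_literal(value):
    """Escape characters that may not appear raw in an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def record_to_ntriples(record):
    """Return one N-Triples line per mapped subfield of the record."""
    subject = "<" + BASE + record["id"] + ">"
    lines = []
    for tag, code, value in record["fields"]:
        prop = MAPPING.get((tag, code))
        if prop is None:
            continue  # unmapped subfields are simply skipped in this sketch
        lines.append('%s <%s> "%s" .' % (subject, prop, escape_literal(value)))
    return lines


# Invented example record: a list of (tag, subfield code, value) tuples.
example = {
    "id": "book-1",
    "fields": [
        ("245", "a", 'En "lille" bog'),
        ("100", "a", "Hansen"),
        ("999", "x", "not mapped"),
    ],
}

for line in record_to_ntriples(example):
    print(line)
```

Keeping the mapping in a plain table makes it easy to refine the conversion step by step, which fits the "pragmatism rather than 100%" approach described earlier.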
13. Publishing
Choice of licence
Basic test files (RDF/XML and Turtle) on oss.dbc.dk
besides general information about the project
Link to Mulgara triplestore on lod.dbc.dk
Default SPARQL interface
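To give an idea of how a client might talk to a SPARQL interface like this, the sketch below builds a query URL following the standard SPARQL protocol convention of passing the query text in a `query` parameter. The endpoint address http://example.org/sparql and the use of dct:title are placeholders chosen for the example; the actual endpoint path and the properties in the DBC data are not given in the slides. No request is sent.

```python
# Minimal sketch: build a SPARQL protocol GET URL (no network access).
# The endpoint URL and the dct:title property are assumptions.
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint


def build_query_url(endpoint, limit=10):
    """Return a URL that asks the endpoint for up to `limit` book titles."""
    query = (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?book ?title\n"
        "WHERE { ?book dct:title ?title }\n"
        "LIMIT " + str(int(limit))
    )
    # urlencode percent-encodes the query text for use in a URL
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, limit=5)
print(url)

# Decode the parameter again to check that the query survives the round trip.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded)
```

The resulting URL could be fetched with any HTTP client, typically with an Accept header naming the desired result format.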
14. What we haven't completed yet
Outgoing links
– Choosing suitable datasets
• VIAF
• DBpedia
– Tools for establishing links to other datasets wanted!
Dereferenceable URIs
Output from triple store in various formats
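The slides say that tools for establishing links were still wanted, so the following is not the project's method. It is only a naive sketch of one possible first pass: compare normalised personal names from local authority records with labels from an external dataset and propose an owl:sameAs link where exactly one candidate matches. All URIs and names are invented, and matching on names alone is unreliable in practice (namesakes, variant spellings), so the output should be read as candidates for review.

```python
# Naive sketch: propose owl:sameAs links by comparing normalised names.
# All URIs and labels below are invented for illustration.
import unicodedata

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalise(name):
    """Lower-case, drop accents and punctuation, and sort the name parts."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = "".join(c if c.isalnum() else " " for c in stripped.lower())
    return " ".join(sorted(cleaned.split()))


def candidate_links(local, external):
    """Return N-Triples lines for local URIs with exactly one name match."""
    index = {}
    for uri, label in external.items():
        index.setdefault(normalise(label), []).append(uri)
    links = []
    for uri, label in local.items():
        matches = index.get(normalise(label), [])
        if len(matches) == 1:  # ambiguous or missing matches are left out
            links.append("<%s> <%s> <%s> ." % (uri, OWL_SAME_AS, matches[0]))
    return links


local_names = {
    "http://example.org/person/1": "Andersen, H. C.",
    "http://example.org/person/2": "Blixen, Karen",
}
external_names = {
    "http://example.com/ext/a": "H. C. Andersen",
    "http://example.com/ext/b": "Karen Blixen",
    "http://example.com/ext/c": "Blixen, Karen",
}

for line in candidate_links(local_names, external_names):
    print(line)
```

Here the second person matches two external entries and is therefore skipped rather than linked, which errs on the side of fewer but safer links.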
15. What's next?
Make URIs dereferenceable
Outgoing links, outgoing links and outgoing links!
Expand the data model
Refining conversion to RDF
Automation of processes
Find another triple store
Share data on the Data Hub
… still a long way to go
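On the first item in this list: one common way to make resource URIs dereferenceable is content negotiation, where the server inspects the client's Accept header and redirects (often with HTTP 303) to a document in a matching format. The sketch below shows only the selection logic. The /resource/ and /data/ URI layout, the file extensions and the HTML default are assumptions for the example and do not describe DBC's setup.

```python
# Minimal sketch: choose a redirect target from an HTTP Accept header.
# URI layout, extensions and the default format are assumptions.

FORMATS = {
    "text/turtle": "ttl",
    "application/rdf+xml": "rdf",
    "text/html": "html",
}
DEFAULT_TYPE = "text/html"


def parse_accept(header):
    """Return (media type, q) pairs from an Accept header, best first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in pieces[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(entries, key=lambda e: e[1], reverse=True)


def redirect_target(resource_uri, accept_header):
    """Return the document URL a redirect for resource_uri would point to."""
    document = resource_uri.replace("/resource/", "/data/", 1)
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*":
            media_type = DEFAULT_TYPE
        if media_type in FORMATS:
            return document + "." + FORMATS[media_type]
    return document + "." + FORMATS[DEFAULT_TYPE]


uri = "http://example.org/resource/book-1"
print(redirect_target(uri, "text/turtle;q=0.9, application/rdf+xml"))
print(redirect_target(uri, "text/html, */*;q=0.1"))
```

The same selection function could also serve the "output in various formats" item, since each format ends up at its own document URL.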
16. Some lessons learned
it's a huge landscape and it's very easy
to get lost
Therefore
there is a great need for guidelines and recommendations
for best practice
stable and solid tools for linking between datasets are
wanted
17. We also learned …
to appreciate other people's results and experiences
to break with our "100% syndrome"
18. Thank you for your attention!
Contact: Susanne Thorborg – mail: st@dbc.dk