This slide deck gives an overview of Dutch e-humanities projects that build upon the datasets of the Koninklijke Bibliotheek, the national library of the Netherlands.
It focuses on 8 projects that reuse the digitized historical newspapers (1618-1995) of the KB.
It was presented on 7 January 2014 at the Huygens Institute for the History of the Netherlands (Huygens ING for short), an institute of the Royal Netherlands Academy of Arts and Sciences (KNAW). With around 100 scholars, it is the largest humanities institute of the Netherlands.
Keywords: biland, delpher, e-humanities, elite network shifts, hirods, historical newspapers, isher, koninklijke bibliotheek, national library of the netherlands, open data, polimedia, political mashup, reuse, sealincmedia, translantis, wahsp
Reusing historical newspapers of KB in e-humanities - Case studies and examples of research projects
1. Reusing historical newspapers of KB in e-humanities
Case studies and examples of research projects
http://www.kb.nl/sites/default/files/kranten.jpg
http://germanics.washington.edu/sites/germanics/files/images/digital_humanities_wordle.p
Huygens ING, 07-01-2014
Olaf Janssen, Koninklijke Bibliotheek, National Library of the Netherlands
olaf.janssen@kb.nl - @ookgezellig - slideshare.net/OlafJanssenNL
3. What I hope you’ll get out of this talk
Improved understanding of
1. e-humanities research projects using KB historic newspapers
2. reuse potential of KB datasets in e-humanities
5. Downsides of Delpher web interface
“Readers [..] are disempowered by machinery that allows them only to choose among options that have been pre-scripted” (Liu *)
Datasets & APIs offer more flexibility
* Marlene Manoff, ‘The Materiality of Digital Collections. Theoretical and Historical Perspectives’,
Portal: Libraries and the Academy 6 (2006) 311-325, p 320.
http://dspace.mit.edu/bitstream/handle/1721.1/35689/6.3manoff.pdf
6. Open datasets & APIs of KB
kb.nl/dataservices
These are only the (semi-)open sets. KB has more sets on offer!
7. KB datasets, Delpher and openness
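Per KB dataset (abbreviation): Set in Delpher? / Set described on kb.nl/dataservices? / Openness
• Early Dutch Books Online (EDBO / DPO)
- Set in Delpher? Yes, Boeken Basiscollectie
- Set described on kb.nl/dataservices? Yes
- Openness: Metadata: CC0; Objects: Public Domain
• Historic Newspapers 1618-1995 (DDD)
- Set in Delpher? Yes, Delpher Kranten
- Set described on kb.nl/dataservices? No
- Openness: Mixed (<1900 Public Domain); available on demand for scientific research
• Radiobulletins from ANP (ANP)
- Set in Delpher? Yes, Delpher Radiobulletins
- Set described on kb.nl/dataservices? Yes
- Openness: Metadata: CC0; Objects: CC-BY-NC-ND
• Proceedings of Parliament 1814-1995 (SGD)
- Set in Delpher? No
- Set described on kb.nl/dataservices? Yes
- Openness: Metadata: CC0; Objects: CC0
• Dutch Magazines 19th + 20th c (DTS)
- Set in Delpher? Yes, Delpher Tijdschriften
- Set described on kb.nl/dataservices? No
- Openness: Mixed (<1900 Public Domain); available on demand for scientific research
• Watermarks in Incunabula in the Low Countries (WILC)
- Set in Delpher? No (images)
- Set described on kb.nl/dataservices? Yes
- Openness: Metadata: CC0; Objects: CC0
• Medieval Illuminated Manuscripts (MVH / Byvanck)
- Set in Delpher? No (images)
- Set described on kb.nl/dataservices? Yes
- Openness: Metadata: CC0; Objects: Public Domain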
8. KB datasets, Delpher and openness
Let’s now look at some e-humanities projects that build upon these 3 sets (with a focus on Historic Newspapers 1618-1995, DDD)
9. For now I’ll focus on 2 e-humanities projects
1. Polimedia
2. Political Mashup
You can self-study
3. Translantis
4. HiRoDs
5. Elite network shifts
6. ISHER
7. WAHSP | BILAND
8. SEALINCMedia
10. 1. Polimedia - What
• Connects transcripts of Dutch Parliament
with media coverage in newspapers and
radio bulletins
• Improved analyses of radio & newspaper
coverage of political debates
• KB = data supplier
- SGD (1945-1995)
- DDD (1945-1995)
- ANP (1945-1984)
• www.polimedia.nl
11. 1. Polimedia - Who
Built by a team from
• Erasmus University Rotterdam
• VU University Amsterdam
• TU Delft
• Netherlands Institute for Sound and Vision
19. 2. Political Mashup - What
• Research programme to make Dutch political data
(1814-present) more understandable and
accessible
• Creation of rich semantically annotated dataset in
XML
For every word ever spoken in Dutch parliament we know
- who said it
- what political party the speaker belonged to
- which role the speaker fulfilled
- when it was said
- to whom it was said
- in which context it was said
• Datasets of KB reused
- Proceedings of Parliament 1814-1995 (SGD)
- Historic newspapers (DDD)
• http://politicalmashup.nl/
• http://politicalmashup.nl/over-political-mashup/
• http://search.politicalmashup.nl/
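To make the idea of a semantically annotated XML dataset more concrete, here is a minimal sketch of how such per-speech annotations could be read in Python. The element and attribute names (`proceedings`, `speech`, `speaker`, `party`, `role`, `to`, `context`) and the sample content are invented for illustration; they are assumptions and not the actual Political Mashup schema or data.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element names, attribute names and the sample
# speeches are invented for illustration. This is NOT the real
# Political Mashup schema.
SAMPLE = """<proceedings date="1950-03-14" context="Housing budget debate">
  <speech speaker="A. Example" party="Example Party" role="mp" to="chair">
    <p>The housing shortage requires action.</p>
  </speech>
  <speech speaker="B. Sample" party="Sample Union" role="minister" to="A. Example">
    <p>The government shares this concern.</p>
  </speech>
</proceedings>"""


def annotated_speeches(xml_text):
    """Yield one dict per speech with the six annotations listed on the slide."""
    root = ET.fromstring(xml_text)
    for speech in root.iter("speech"):
        yield {
            "who": speech.get("speaker"),       # who said it
            "party": speech.get("party"),       # party of the speaker
            "role": speech.get("role"),         # role of the speaker
            "when": root.get("date"),           # when it was said
            "to_whom": speech.get("to"),        # to whom it was said
            "context": root.get("context"),     # in which context
            "text": " ".join(p.text.strip() for p in speech.iter("p") if p.text),
        }


records = list(annotated_speeches(SAMPLE))
for record in records:
    print(record["when"], record["who"], "(" + record["party"] + "):", record["text"])
```

Once every speech carries these attributes, questions such as "what did ministers say to a given member in a given year" become simple filters over the records.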
20. 2. Political Mashup - Who
Team around Maarten Marx (Univ of Amsterdam)
• Historians (Groningen Univ)
• Language technologists (Univ of Tilburg, UoA)
• Computer scientists (UoA)
Datasets & suppliers
• Proceedings of Parliament 1814-1995: Koninklijke Bibliotheek
• Proceedings of Parliament 1995-present: officielebekendmakingen.nl
• Political party + election manifestos, websites of political parties: Centre for Documentation of Dutch Political Parties (DNPP)
• Biographical data political parties & politicians: Centre for Parliamentary Documentation (PDC)
http://ccct.uva.nl/content/project/politicalmashup
21. Political Mashup & SGD – Polidocs
Polidocs
• Proof-of-concept search engine for
Proceedings of Parliament 1984-2008
• Part of Political Mashup
programme, by students UoA
• www.polidocs.nl
• www.polidocs.nl/about.html
22. Political Mashup & SGD – Ngram viewer
Political ngram viewer
• Frequency of words (or phrases) in
Dutch parliament through time
• Inspired by Google ngram viewer
• http://ngram.politicalmashup.nl
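The core computation behind an ngram viewer of this kind can be sketched as a relative word frequency per year. The snippet below is an illustration only: the function name `yearly_frequency`, the simple tokenizer and the toy documents are assumptions, not the implementation used by the Political ngram viewer.

```python
import re
from collections import Counter


def yearly_frequency(docs, term):
    """Return {year: share of tokens equal to `term`} for (year, text) pairs.

    Handles single words only; phrases would require counting n-grams
    instead of individual tokens.
    """
    hits = Counter()
    totals = Counter()
    term = term.lower()
    for year, text in docs:
        tokens = re.findall(r"\w+", text.lower())  # naive word tokenizer
        totals[year] += len(tokens)
        hits[year] += tokens.count(term)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}


# Toy input, invented for illustration (not real parliamentary text).
toy_docs = [
    (1950, "de woningnood is groot"),
    (1950, "de begroting"),
    (1951, "woningnood woningnood en begroting"),
]
frequencies = yearly_frequency(toy_docs, "woningnood")
print(frequencies)
```

Normalising by the total number of tokens per year matters because the volume of text differs strongly between years; raw counts would mostly reflect how much was said, not how often the word was used.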
23. Oct 2012: Political Mashup wins Dutch Data Prize 2012
http://politicalmashup.nl/2012/10/politicalmashup-wint-nederlandse-dataprijs-2012
“Political Mashup enables every citizen to better understand the political process, both for the distant past and for yesterday, and to effectively scrutinise their elected representatives. It thereby not only promotes the reliability (and repeatability) of political-historical research, but also enables citizens to be truly democratic citizens.”
24. Political Mashup & SGD - Further reading
http://kb.nl/sites/default/files/docs/politicalmashup-casestudy-hergebruik-open-data.pdf
25. Political Mashup & DDD – Newspaper stats
http://politicalmashup.nl/2013/03/de-omvang-van-het-kb-kranten-archief/
26. Political Mashup & DDD – Meats in newspapers
http://politicalmashup.nl/2013/03/vleesch-in-de-nederlandse-krant/
29. Political Mashup & DDD - Further reading
Marx & Nusselder, 2013
https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxkZXZlbG9wbWVudGR1dGNobGFuZ3VhZ2V8Z3g6NzlkMTJlZjE3MDRjZWFiNA
30. 3. Translantis
• Emergence of the United States in public
discourse in the Netherlands
• Analysis of how the USA served as a cultural
model for the Netherlands in the
period 1890-1990
• Utrecht University, University of
Amsterdam, Huygens ING
• Texcavator text mining tool
• KB = provider of newspapers 1890-1990
• http://translantis.wp.hum.uu.nl
31. Texcavator text-mining tool
• Based on open-source text analytics
platform xTAS
(University of Amsterdam)
• Used in projects
- Translantis
- Political Mashup (see 2.)
- WAHSP & BILAND (see 7.)
- Infiniti
• http://dev.wahsp.nl/texcavator (login?)
• http://xtas.net/demonstrators
ShoShin (KB newspapers)
32. 4. HiRoDs
• HIstorical Roots of the Dutch Sustainability
Challenge
• Tracing historical roots of current
sustainability problems in NL by
analyzing historical data on economic
development, flows of materials, energy
use etc. from 1850 onward
• KB = provider 19th c. newspaper data
• http://www.nwo.nl/onderzoek-enresultaten/onderzoeksprojecten/48/2300167248.html
33. 5. Elite network shifts
• Formation, circulation and relocation
of elites during regime transitions of
1945–50 and 1998 in the Netherlands
Indies and Indonesia, by analyzing
digitized newspapers
• KITLV, NIOD, University of Amsterdam,
Bandung Institute of Technology, Erasmus
University, DANS
• KB = (potential) provider of Indonesian
language newspapers 1945-1957
• http://www.kitlv.nl/home/Projects?id=25
• http://www.ehumanities.nl/computationalhumanities/elite-network-shifts/
34. 6. ISHER
• Integrated Social History Environment for
Research
• Detect & associate events, trends,
people and organisations related to
social unrest (e.g. strikes) in historical
newspapers
• KB = provider of newspaper data
• http://www.nactem.ac.uk/DID-ISHER
• http://www.diggingintodata.org/LinkClick.aspx?fileticket=3XQTLdzggoo%3D&tabid=196
35. 7. WAHSP | BILAND
• WAHSP: web-application for sentiment
mining in historical public media
(newspapers, magazines and radio bulletins)
• Example: researching public sentiment
around drugs using Dutch newspapers
1900-1945
• Texcavator text mining tool
• BILAND: extends the tool for bilingual
research by also including German newspapers
• KB = provider of newspaper data
• http://www.biland.nl
• http://dev.wahsp.nl/texcavator (login?)
36. 8. SEALINCMedia
• Socially Enriched Access to LINked
Cultural Media
• Modeling and evaluating (social,
web2.0) human input for
multimedia content access, with
the aim to integrate it in
automatic data analysis
• Public-private partnership:
3 universities, 1 scientific institute, 4 technology
companies, 5 heritage institutions (data providers)
• KB = provider of newspaper data
• http://www.commit-nl.nl/projects/sociallyenriched-access-to-linked-cultural-media
• http://sealincmedia.wordpress.com/