1. Department of Parliamentary Services
Parliamentary Library
Semantic Web Technologies at the Victorian Parliamentary Library
Peter Neish, Systems Officer
Victorian Parliamentary Library
@peterneish
2. The Library
• Established in 1851
• Clients:
– Members of Parliament and their staff
– Department of Parliamentary Services (especially committees)
– Academics and the public
• Historical information, Hansard, parliamentary papers, government agencies, member biographies
• Provides current information to members, e.g. news clips, video, journal articles, media releases
3. Number of Media Releases per year
[Chart: number of media releases per year, 1992 to 2010; vertical axis from 0 to 7000]
4. Semantic Tagging
• Increasing number of media releases meant that manual indexing was too time consuming
• Examined ways of automatically tagging media releases without human intervention
• Web services examined:
– Alchemy API
– Evri
– OpenAmplify
– OpenCalais
– Yahoo Term Extractor
– Zemanta
5. OpenCalais
• Product of Thomson Reuters – focus is on news articles
• Generous limits on API calls
• Data in RDF/XML, N3, Simple Text, Microformats, JSON
• Good documentation and community
• http://viewer.opencalais.com/
However: it is a closed box (the algorithm is secret), and the company appears to have recently scaled back development
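As a rough illustration of working with the JSON output mentioned above, the sketch below pulls entity tags out of a response. The sample response is hand-written, and the field names (`_typeGroup`, `_type`, `name`, `relevance`) are assumptions about the service's JSON layout that should be checked against the OpenCalais documentation; `extract_entity_tags` is a hypothetical helper, not part of any library.

```python
import json

# Hand-written sample shaped like an OpenCalais JSON response.
# Keys and field names are assumptions for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "doc": {"info": {"docTitle": "Sample media release"}},
    "http://example.org/calais/1": {
        "_typeGroup": "entities", "_type": "City",
        "name": "Melbourne", "relevance": 0.71,
    },
    "http://example.org/calais/2": {
        "_typeGroup": "entities", "_type": "Organization",
        "name": "Parliament of Victoria", "relevance": 0.32,
    },
    "http://example.org/calais/3": {
        "_typeGroup": "topics", "name": "Politics",
    },
})


def extract_entity_tags(response_text, min_relevance=0.0):
    """Return (name, type, relevance) tuples for entities, most relevant first."""
    data = json.loads(response_text)
    tags = []
    for value in data.values():
        # Skip document metadata and anything that is not an entity record.
        if not isinstance(value, dict) or value.get("_typeGroup") != "entities":
            continue
        relevance = value.get("relevance", 0.0)
        if relevance >= min_relevance:
            tags.append((value["name"], value["_type"], relevance))
    return sorted(tags, key=lambda tag: tag[2], reverse=True)


if __name__ == "__main__":
    for name, kind, relevance in extract_entity_tags(SAMPLE_RESPONSE):
        print(f"{name} ({kind}): {relevance}")
```

A relevance threshold such as `min_relevance=0.5` is one possible way to trim weak tags before they reach a catalogue record; the right cut-off would need testing on real media releases.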
6. Number of Tags assigned by OpenCalais
[Chart: total number of items (vertical axis, 0 to 4500) against tags per item (horizontal axis, 0 to 120)]
7. Tag Quality
[Pie chart: segments of 85%, 6%, 5% and 4%; legend: Correct Tags, Incorrect Tags, Repeated Tags, Redundant Tags]
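A breakdown like this could be tallied from a manually reviewed sample of tags. The sketch below assumes each reviewed tag has been given one of the four labels from the chart legend; the sample data and the `tag_quality_shares` helper are hypothetical and do not reproduce the Library's figures.

```python
from collections import Counter


def tag_quality_shares(reviewed_tags):
    """Return the percentage of reviewed tags carrying each quality label."""
    counts = Counter(label for _tag, label in reviewed_tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical review results: (tag, label assigned by a human reviewer).
SAMPLE_REVIEW = [
    ("Melbourne", "correct"),
    ("Victoria", "correct"),
    ("Victoria", "repeated"),
    ("Paris", "incorrect"),
]

if __name__ == "__main__":
    for label, share in tag_quality_shares(SAMPLE_REVIEW).items():
        print(f"{label}: {share}%")
```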
8. OpenCalais RDF
• OpenCalais links to its own ontology (rich in data for companies, but other classes have limited data)
• RDF has a lot of N-ary relationships (up to 1000 triple statements per article)
• SameAs or web links to:
– DBpedia, Wikipedia, Freebase, Reuters.com, GeoNames, Shopping.com, IMDB, LinkedMDB
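To make the SameAs links concrete, the sketch below filters a list of (subject, predicate, object) triples for `owl:sameAs` statements and groups the targets by host. It uses plain tuples rather than an RDF library, and the subject and GeoNames URIs are placeholders invented for illustration; `same_as_links_by_host` is a hypothetical helper.

```python
from collections import defaultdict
from urllib.parse import urlparse

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def same_as_links_by_host(triples):
    """Group owl:sameAs targets by the host they point to."""
    links = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == OWL_SAME_AS:
            links[urlparse(obj).netloc].append((subject, obj))
    return dict(links)


# Placeholder triples; the subject and the GeoNames identifier are made up.
SAMPLE_TRIPLES = [
    ("http://example.org/entity/melbourne", RDF_TYPE,
     "http://example.org/type/City"),
    ("http://example.org/entity/melbourne", OWL_SAME_AS,
     "http://dbpedia.org/resource/Melbourne"),
    ("http://example.org/entity/melbourne", OWL_SAME_AS,
     "http://sws.geonames.org/0000000/"),
]

if __name__ == "__main__":
    for host, pairs in same_as_links_by_host(SAMPLE_TRIPLES).items():
        print(host, len(pairs))
```

Filtering down to the sameAs statements first is one way to cope with the volume of N-ary relationship triples in each article's RDF, since only the outbound links are needed for connecting records to external datasets.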
10. Current Projects using Linked Data
• Government Agencies Database
• Parliamentary Papers
• Images
Editor's notes
Talk about the Parliamentary Library: established in 1851, the building itself built 1858–60