How to find 50,000 maps in a haystack of 1,000,000 images; geolocate them, and categorise them ... on a budget of no or not many euros.
The collection of 1,000,000 images extracted by the British Library from 19th-century books is a wonderful resource, but one that Wikimedia Commons felt it could not accept other than through exhaustive hand-uploading: without good image-level metadata about the subject of each image, the images could not be categorised and so would simply not be discoverable. This talk describes a joint BL/Wikimedia initiative to go through the images systematically, which discovered 50,000 maps in eight weeks. Geolocating these map images then makes it possible to use automated tools to group, organise and categorise them in different ways, the key step in making them valuable and reusable.
(5-minute "ignite" talk, given at the start of Europeana Tech 2015)
Mapping the Maps (Europeana Tech, Feb 12, 2015 - Ignite talk)
1. Case Study: Mapping the Maps
How to find 50,000 maps in a haystack of 1,000,000 images;
geolocate them, and categorise them
... on a budget of no or not many euros.
James Heald,
Wikimedia volunteer
@heald_j
Kimberly Kowal,
British Library
Kimberly.Kowal@bl.uk
25. -- including 20,000 found independently by @Quasimondo,
machine-assisted using his own pattern recognition methods
50,000 maps in all:
                 totals   classmark index   detailed index
                 ------   ---------------   --------------
misc              16074             14091             1983
Europe            13136              6254             6882
British Isles      7191               269             6922
North America      6758              1524             5234
USA                5782              1209             4573
Asia               2736              1280             1456
Africa             2300              1075             1225
South America       895               659              236
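
The three figures in each row appear to be a total followed by its split between the classmark index and the detailed index. The following is a minimal Python sketch, assuming that reading of the columns, which checks that each total equals the sum of the other two figures. The names `ROWS` and `inconsistent_rows` are illustrative only and do not come from the talk.

```python
# Rows copied from the slide: (region, total, classmark index, detailed index).
# Reading the columns this way is an assumption based on the header layout.
ROWS = [
    ("misc", 16074, 14091, 1983),
    ("Europe", 13136, 6254, 6882),
    ("British Isles", 7191, 269, 6922),
    ("North America", 6758, 1524, 5234),
    ("USA", 5782, 1209, 4573),
    ("Asia", 2736, 1280, 1456),
    ("Africa", 2300, 1075, 1225),
    ("South America", 895, 659, 236),
]


def inconsistent_rows(rows):
    """Return the regions whose total differs from classmark + detailed."""
    return [
        region
        for region, total, classmark, detailed in rows
        if total != classmark + detailed
    ]


if __name__ == "__main__":
    # An empty list means every row adds up under the assumed column reading.
    print(inconsistent_rows(ROWS))  # prints []
```

Under this reading every row is internally consistent. The sketch does not add the rows together, since some regions may overlap (for example, USA and North America).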