Presentation given by Catherine Hardman of the Archaeology Data Service in York.
The presentation was given at the 'Managing Archaeology Data' event on Monday 7th March 2011 at the University of Glasgow.
9. The scale of the problem in the 1990s: Strategies for protecting physical media. Findings and Recommendations from ‘Digital Data in Archaeology: A Survey of User Needs’ (Condron et al. 1999)
11. The scale of the problem in the 1990s: The popularity of storage options. Findings and Recommendations from ‘Digital Data in Archaeology: A Survey of User Needs’ (Condron et al. 1999)
12. 8" Floppy 3.5" Floppy 5.25" Floppy 12" Optical Disk 5.25" Optical Disk CD-ROM Sparq Disk Cartridge Zip Disk Click! DVD-ROM Jaz Disk Floptical Disk Punch Tape Rectangular Hole Punch Card IBM 3480 DLT Tape DG90M Tape DC4_120 8mmD-eight QIC DC600 G2000 Tape 4mm Tape Ditto Max 9-Track Reel Cassette tape Memory Stick MultiMedia Card SD Memory Card xD Picture Card Smart Media CompactFlash Travan
14. How do we do it? Open Archival Information System (OAIS)
16. Migration-based approach & controlled ingest: Aim to connect with data producers early on in their project lifecycles to ensure that preservation planning is a key consideration during the project rather than an afterthought.
19. The size of digital archives held by different types of archaeological bodies. http://ads.ahds.ac.uk/ Archaeology Data Service
20. Big Data Project: Roughly how much data would be generated by a single project?
21. Which of these data collection techniques do you carry out? [Chart: Technologies used. Categories: 3D Laser Scanning; Sidescan Sonar; Multibeam Scanning; Single Beam Scanning; Geophysics; Acoustic Tracking; Sub bottom profiling; Geographic (e.g. GIS); Lidar; Digital Video; Video Movie Clips; Still Images; CAD (2D or 3D); Other. Values: 12%, 4%, 4%, 3%, 8%, 1%, 3%, 11%, 9%, 9%, 7%, 14%, 3%, 12%.]
How big is your data? We asked this in order to get an idea of the scale of the problem. You’ll see there is some quite big data being produced out there, with some people producing over 200GB for a project.
We ran an online questionnaire to find out about users and uses of big data. I’ll just skim through some of the things that came out of it. We got 48 responses. This is one of the first questions we asked: we wanted to get an idea of the data collection techniques that people are using to create big data. You’ll see there’s a wide range of technologies, including the ones I mentioned on an earlier slide.
Of the 101 software packages entered into the online form, a staggering 52 are unique (that is, after editing for things like lower- and upper-case character differences). It seems the world of ‘big data’ is very fragmented.
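The kind of tidying described here, folding away upper and lower case differences before counting distinct entries, might look something like the sketch below. It is only an illustration: the function names are made up, the whitespace clean-up is an added assumption, and the sample entries are invented rather than taken from the questionnaire responses.

```python
def normalise(name: str) -> str:
    """Fold case and collapse stray whitespace so near-identical entries match."""
    # Whitespace handling is an assumption; the talk only mentions case differences.
    return " ".join(name.split()).casefold()


def count_unique(entries: list[str]) -> int:
    """Count distinct software names after normalisation, ignoring blank entries."""
    return len({normalise(entry) for entry in entries if entry.strip()})


# Invented sample entries for illustration only, not the survey data.
sample = ["ArcGIS", "arcgis", " AutoCAD", "AutoCAD ", "Cyclone"]
print(f"{len(sample)} entries, {count_unique(sample)} unique after tidying")
```

Anything beyond case and spacing (version numbers, abbreviations, misspellings) would still need editing by hand.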
This is an interesting one. We asked if people had an archival policy for the data sets in question. Only 48% of respondents said that they have a policy in place. Of these, many noted that their policies were localised and incomplete, not formal written policy. A proper system of digital archiving should involve continuous active management of the data; putting data on a DVD and putting it in a drawer is not really a stable archival policy. A formal archival policy, as we see it, should ideally be based on the OAIS model: continuous active management of data to ensure its survival into the future.
An overwhelming “yes” to this question. Some of the reasons that were cited: monitoring over time, avoiding duplication, and saving time and money. Of course, re-use just isn’t possible unless someone is archiving and providing access to this data.