Managing large and complex data sets: … the challenges of archiving and online delivery. Catherine Hardman
The problem… in 1996: "My lithics report here, on floppy disc"
The ADS: some ancient history
What do we do?
No need for digital preservation. Domesday Book: publisher William of Normandy (1086), still readable
Where’s preservation when you need it? Domesday Disc: publisher BBC (1986), nearly lost
Why is it important?
What’s the problem? Information entropy
The scale of the problem in the 1990s: strategies for protecting physical media. Findings and recommendations from ‘Digital Data in Archaeology: A Survey of User Needs’ (Condron et al. 1999)
Protecting physical media … never the twain
The scale of the problem in the 1990s: the popularity of storage options. Findings and recommendations from ‘Digital Data in Archaeology: A Survey of User Needs’ (Condron et al. 1999)
8" Floppy, 3.5" Floppy, 5.25" Floppy, 12" Optical Disk, 5.25" Optical Disk, CD-ROM, Sparq Disk Cartridge, Zip Disk, Click!, DVD-ROM, Jaz Disk, Floptical Disk, Punch Tape, Rectangular Hole Punch Card, IBM 3480, DLT Tape, DG90M Tape, DC4_120, 8mmD-eight, QIC DC600, G2000 Tape, 4mm Tape, Ditto Max, 9-Track Reel, Cassette tape, Memory Stick, MultiMedia Card, SD Memory Card, xD Picture Card, Smart Media, CompactFlash, Travan
Why is it all so difficult?
How do we do it? Open Archival Information System (OAIS)
But that’s people…
Migration-based approach and controlled ingest: aim to connect with data producers early in their project lifecycles, so that preservation planning is a key consideration during the project rather than an afterthought.
Guides to help you do all that.
It hasn’t really got much easier
The size of digital archives held by different types of archaeological bodies. Archaeology Data Service, http://ads.ahds.ac.uk/
Big Data Project: roughly how much data would be generated by a single project?
Which of these data collection techniques do you carry out? Technologies used: 3D Laser Scanning 12%; Sidescan Sonar 4%; Multibeam Scanning 4%; Single Beam Scanning 3%; Geophysics 8%; Acoustic Tracking 1%; Sub bottom profiling 3%; Geographic (eg GIS) 11%; Lidar 9%; Digital Video 9%; Video Movie Clips 7%; Still Images 14%; CAD (2D or 3D) 3%; Other 12%
What are the main software packages you use?
Do you have an archiving policy for the data sets / types in question?
back-up
When you start a new project… would you consider using existing datasets?
This is the opportunity!
 
Making the inaccessible accessible
Blurring the distinction … between publication and archives …
Making the LEAP…
 
What does that mean for you?
How do you do that?
We’re here to help

Editor's notes

  1. How big is your data? We asked this to get an idea of the scale of the problem. You’ll see there is some quite big data being produced out there: some people are producing over 200 GB for a single project.
  2. We ran an online questionnaire to find out about the users and uses of big data; I’ll just skim through some of the things that came out of it. We got 48 responses. This is one of the first questions we asked: we wanted to get an idea of the data collection techniques that people are using to create big data. You’ll see there’s a wide range of technologies, including the ones I mentioned on an earlier slide.
  3. Of the 101 software packages entered into the online form, a staggering 52 are unique (that is, after editing for things like upper- and lower-case differences). It seems the world of ‘big data’ is very fragmented.
  4. This is an interesting one. We asked whether people had an archival policy for the data sets in question. Only 48% of respondents said they have a policy in place, and many of these noted that their policies were localised and incomplete rather than formal written policy. A proper system of digital archiving should involve continuous active management of the data; putting data on a DVD and putting it in a drawer is not really a stable archival policy. A formal archival policy, as we see it, should ideally be based on the OAIS model: continuous active management of data to ensure its survival into the future.
  5. An overwhelming “yes” to this question. Some of the reasons cited: monitoring over time, avoiding duplication, and saving time and money. Of course, re-use just isn’t possible unless someone is archiving and providing access to this data.
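
The tidy-up mentioned in note 3 (editing the software package entries for case differences before counting the unique ones) can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used on the survey: the function names and the sample entries are hypothetical, and the normalisation rule (case-folding plus collapsing stray whitespace) is an assumption about what that editing involved.

```python
from collections import Counter


def normalise(name: str) -> str:
    """Collapse whitespace and ignore case so 'ArcGIS' and ' arcgis ' match.

    Assumed rule: the survey only says entries were edited for
    upper/lower case differences.
    """
    return " ".join(name.split()).casefold()


def summarise(entries):
    """Return (total entries, distinct packages, per-package counts)."""
    # Skip blank form fields, then count each normalised name.
    counts = Counter(normalise(e) for e in entries if e.strip())
    return sum(counts.values()), len(counts), counts


# Hypothetical free-text answers, NOT the real survey responses.
sample = ["ArcGIS", "arcgis", "AutoCAD", "Cyclone", " autocad ", "Surfer"]

total, unique, counts = summarise(sample)
print(f"{total} entries, {unique} distinct packages")
```

Run against the real form export, the same two numbers would correspond to the 101 entries and 52 distinct packages quoted above, provided the normalisation matches the editing that was actually done.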