Taming the Monster
Digital Preservation Planning and Implementation Tools

Dorothea Salo
One System, One Library
2 June 2011

Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
Why is this so scary?


Photo: “Happy Easter, to my Peeps”
http://www.flickr.com/photos/76074333@N00/449028423/
WorldIslandInfo.com / CC-BY 2.0
Isn’t this just as scary?




Photo: “News Paper Origami Dragon Monster”
http://www.flickr.com/photos/epsos/3777343342/
epSos.de / CC-BY 2.0
Yet we persevere.




Photo: “News Paper Origami Dragon Monster”
http://www.flickr.com/photos/epsos/3777343342/
epSos.de / CC-BY 2.0
DIGITAL IS NO DIFFERENT.


Photo: “559 - The Matrix - Seamless Texture”
http://www.flickr.com/photos/zooboing/4335531915/
Patrick Hoesly / CC-BY 2.0
Many of the same ideas apply...
           • Planning and policy
           • Risk assessment
           • Risk management
                  • (knowing that we can’t save everything)
           • Materials quality matters!
           • Problem discovery and remediation
           • Crisis management
           • Chief problems: staff, $$$, organizational
             commitment
Photo: “Where I Teach”
http://www.flickr.com/photos/eklektikos/2541408630/
Todd Ehlers / CC-BY 2.0
Planning and assessment tools

Scene-setting

           • Rosenthal, David, et al. “Requirements for Digital
             Preservation Systems: A Bottom-Up Approach.”
                  • http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html
           • If you’re new to this, or trying to find your
             feet, this is the best short introduction I
             know.
                  • The list of threats is outstanding.

Photo: “Bottoms Up! - Duck; San Anton Gardens, Malta”
http://www.flickr.com/photos/foxypar4/3123113762/
John Haslam / CC-BY 2.0
TRAC
• “Trustworthy Repositories Audit & Certification: Criteria and Checklist”
• Despite the name, covers a LOT more than the technology!
  • Budget
  • Staffing
  • “designated communities”
• CRL will audit you, if you like
  • (don’t, unless you’re really serious!)
• http://catalog.crl.edu/record=b2212602~S1
DRAMBORA
• Digital Repository Audit Method Based on
  Risk Assessment
• A “self-test,” if you will.
  • DRAMBORA is equally good as a pre- or post-test.
• Personally, I prefer DRAMBORA to TRAC, especially for those just starting out.
• http://www.repositoryaudit.eu/
  • (registration required for toolkit access)
Coping with file formats

The one acronym you need to know: FITS
• “File Information Tool Set”
  • (you need to know this; otherwise it’s hard to Google)
• Wrapper for several file-format detector
  software packages
• Intended to be baked into other software
• It’s early days yet!
  • (This means you can’t always trust what the tools tell
    you, especially when they’re telling you about errors.)
What’s this file?

• wotsit.org “The Programmer’s File and
  Data Resource”
• Directory of file extensions
• When in doubt: open in a browser or text
  editor and see what you get.
  • N.b.: Microsoft Word is NOT a text editor!
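
The "open it and see what you get" advice can be partly automated. What follows is a minimal sketch, not how FITS or any of the tools it wraps actually work: it reads the first few bytes of a file and compares them against a tiny hand-picked table of well-known "magic number" signatures. The names `MAGIC_SIGNATURES`, `sniff_bytes`, and `sniff_file` are made up for illustration, and the table is nowhere near a complete format registry.

```python
import codecs

# Illustrative sample of leading-byte signatures; real registries hold far more.
MAGIC_SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"PK\x03\x04", "ZIP container (also .docx, .xlsx, .epub)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (legacy .doc, .xls, .ppt)"),
]


def sniff_bytes(head: bytes) -> str:
    """Guess a format label from the first bytes of a file."""
    if not head:
        return "empty file"
    for signature, label in MAGIC_SIGNATURES:
        if head.startswith(signature):
            return label
    # No known signature: fall back to a crude text-versus-binary check.
    if b"\x00" in head:
        return "unknown binary"
    try:
        # The incremental decoder tolerates a multi-byte character that was
        # cut off at the end of the sample, but rejects invalid bytes.
        codecs.getincrementaldecoder("utf-8")().decode(head)
    except UnicodeDecodeError:
        return "unknown binary"
    return "probably plain text"


def sniff_file(path: str, length: int = 16) -> str:
    """Read the first `length` bytes of `path` and guess its format."""
    with open(path, "rb") as handle:
        return sniff_bytes(handle.read(length))
```

A legacy Word file shows up here as an OLE2 container rather than as text, which is the point of the "Word is NOT a text editor" warning. Treat the answer as a hint to check by hand, not as a verdict.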
Solving the geographic distribution problem

What problem, now?
           • The “all your eggs in one basket” problem.
                  • If all your bits are on one server, and the server room
                    is flooded, or your town is nuked—oops.
           • Not the same as backups!
                  • Don’t get me wrong, backups are important!
                  • Backups are SHORT-TERM, and usually LOCAL.
                    Geographic distribution (plus associated auditing) is
                    intended for the long term.
                  • Don’t forget auditing!
Photo: “Nido”
http://www.flickr.com/photos/italintheheart/3679974298/
Jorge Elías / CC-BY 2.0
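
To make "distribution plus auditing" concrete, here is a rough sketch of an audit pass over several copies of the same file. It assumes each replica is reachable as a directory (a mounted share, say) and that every site uses the same relative path; both are assumptions for illustration. The function names are hypothetical, and the simple majority vote is a stand-in, not the protocol LOCKSS or any other product uses.

```python
import hashlib
from collections import Counter
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_replicas(replica_roots, relative_path):
    """Compare one file across replicas.

    Returns (majority_digest, suspect_roots). A replica is suspect when its
    copy is missing or its digest differs from the most common digest.
    On a tie, the digest seen first wins, so ties need a human.
    """
    digests = {}
    for root in replica_roots:
        candidate = Path(root) / relative_path
        digests[str(root)] = sha256_of(candidate) if candidate.is_file() else None
    counts = Counter(d for d in digests.values() if d is not None)
    if not counts:
        return None, sorted(digests)
    majority, _ = counts.most_common(1)[0]
    suspects = sorted(root for root, d in digests.items() if d != majority)
    return majority, suspects
```

Run something like this on a schedule and a bad copy at one site gets noticed while the good copies still exist elsewhere. That is the difference between distribution with auditing and a backup nobody ever checks.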
LOCKSS
• Lots of Copies Keeps Stuff Safe!
  • (There is also Portico, but Portico only works with
    e‑journal content.)
  • Open-source software that handles replication and
    (some) auditing.
• “Private LOCKSS network”
  • A group of institutions agrees to build a LOCKSS
    network just for the stuff they’re interested in.
  • ASERL does this for ETDs. Many institutions
    (including UW-Madison) participate in a PLN for
    govdocs.
“The cloud”
       • Typical cloud-based storage services make
         NO promises they won’t lose your stuff.
              • And for large quantities of data, bandwidth can become
                an issue.
              • And can they look at your stuff? Should they be able to?
       • Some early movers in this market are fading
              • Iron Mountain had to kill their service.
       • DuraCloud
              • trying to finesse this issue by negotiating tougher SLAs
                with cloud-storage providers
Photo: “Sky View From Humboldt Park”
http://www.flickr.com/photos/purpleslog/2589612577/
Purple Slog / CC-BY 2.0
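
The bandwidth point is easy to check with back-of-the-envelope arithmetic. This small sketch estimates how long an upload takes; the function name, the decimal-terabyte convention, and the example figures (10 TB over a 100 Mbit/s link) are assumptions chosen for illustration, not numbers from any provider.

```python
def transfer_days(size_terabytes: float, link_megabits_per_second: float,
                  efficiency: float = 1.0) -> float:
    """Estimate days needed to move data over a network link.

    Uses decimal units (1 TB = 10**12 bytes, 1 Mbit = 10**6 bits).
    `efficiency` is the fraction of the link you really get, between 0 and 1.
    """
    if link_megabits_per_second <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    bits = size_terabytes * 1e12 * 8
    seconds = bits / (link_megabits_per_second * 1e6 * efficiency)
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    print(f"{transfer_days(10, 100):.1f} days")
```

At full line rate that works out to a bit over nine days for 10 TB, and real links rarely deliver full line rate. The same arithmetic applies in the other direction when you need your stuff back.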
Repository and digital-library platforms

Friendly word of advice:

PICK SOFTWARE LAST.

Photo: “Briana Calderon; future educator of america.”
http://www.flickr.com/photos/46132085@N03/4703617843/
Arielle Calderon / CC-BY 2.0
Another friendly word of advice:

DON’T CHASE THE SHINY.

Photo: “Sparkle Texture”
http://www.flickr.com/photos/abbylanes/3214921616/
Abby Lane / CC-BY 2.0
Digital-library software
         • Is almost always VERY BAD at digital
           preservation!
                • (most packages don’t even try!)
                • So if a file gets corrupted on the server, or whatever...
                  no warnings, no restore, nothing. Also, provenance?
                  Who needs provenance? Event tracking? What’s that?
         • I’m not saying don’t use it. I’m saying that
           it doesn’t solve this problem.
                • In fact, if you’re using this software, you need to solve
                  this problem FOR IT.
Photo: “National DIGITAL Library”
http://www.flickr.com/photos/schex/193912573/
Jesse Schexnayder / CC-BY 2.0
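
One modest way to solve the corruption problem for the software is to keep your own checksum manifest next to its file store and re-check it regularly. The sketch below assumes the files sit in an ordinary directory tree you can read; `build_manifest` and `verify_manifest` are hypothetical names, and none of this is a feature of ContentDM, Omeka, Greenstone, or any other package.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root, manifest: dict) -> dict:
    """Compare the tree against a stored manifest and report differences."""
    current = build_manifest(root)
    return {
        "missing": sorted(set(manifest) - set(current)),
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

Store the manifest somewhere other than the server it describes, and keep each dated report: that gives you a crude event log. It only warns. Restoring a good copy still depends on having one somewhere else.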
Examples


• ContentDM: http://contentdm.com/
• Omeka: http://omeka.org/
• Greenstone: http://greenstone.org/
Institutional-repository software

        • Is SHOCKINGLY bad at digital preservation!
              • (Though sometimes better than most DL software.)
        • Examples
              • Hosted/commercial: Digital Commons (BePress),
                ContentDM, DigiTool
              • If you go hosted, you’d better ask about their
                digital-preservation practices!
              • Open-source: EPrints, DSpace, Fedora
Photo: “IMG_0668”
http://www.flickr.com/photos/12967790@N00/66531124
Robert / CC-BY 2.0
A new approach: curation microservices

Do we really need THE BLOB?

Photo: “giant crystal blob”
http://www.flickr.com/photos/a_of_doom/527905701/
A of DooM / CC-BY 2.0
How about a jigsaw puzzle instead?
             • Break the digital-preservation problem
               down into parts.
             • Code up each part, making sure that it
               plays nicely with other parts.
                    • lots of nice APIs!
                    • which means other software can adopt/adapt
                      microservices as well!
             • Put parts together as you need them.
Photo: “Lapsana Apogonoides Puzzle”
http://www.flickr.com/photos/gdesigneralex/2313092112/
gdesigneralex / CC-BY 2.0
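
The jigsaw idea can be sketched in a few lines: each piece does one job, all pieces share one simple interface, and you chain only the ones you need. Everything here is hypothetical and for illustration only (the record layout, the service names, `run_pipeline`); it is not the California Digital Library's actual microservice code or API.

```python
import hashlib
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Microservice = Callable[[Record], Record]


def add_size(record: Record) -> Record:
    """Record the byte length of the content."""
    return {**record, "size_bytes": len(record["content"])}


def add_fixity(record: Record) -> Record:
    """Record a SHA-256 digest of the content."""
    return {**record, "sha256": hashlib.sha256(record["content"]).hexdigest()}


def log_event(name: str) -> Microservice:
    """Build a service that appends a named event to the record's history."""
    def service(record: Record) -> Record:
        events = list(record.get("events", [])) + [name]
        return {**record, "events": events}
    return service


def run_pipeline(record: Record, services: List[Microservice]) -> Record:
    """Apply each service in order; every service returns a new record."""
    for service in services:
        record = service(record)
    return record
```

A call such as `run_pipeline({"content": b"abc"}, [add_size, add_fixity, log_event("ingest")])` returns a record carrying a size, a digest, and one logged event. Because each piece only agrees on the record shape, any piece can be swapped out or reused in other software without touching the rest.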
California Digital Library


• Pioneering this approach
• Has open-sourced code for microservices
• Has added microservices together to build
  its “Merritt” storage/repository service
Escaping the silos: Fedora Commons

What is Fedora Commons?
• Blueprints and foundation, not the whole
  house (analogy credit to Peter Gorman)
• You build the house you want!
• Or you build condominiums on the same
  foundation.
  • Need different user interfaces for different materials?
  • Need different structures and behaviors?
  • No problem! Fedora can handle that.
• (have I run this analogy into the ground yet?)
We had this...




                 Diagram courtesy of Peter Gorman.
We are building this.




                 Diagram courtesy of Peter Gorman.
E-records management

Axioms
• Records management is about policy and procedures.
  • If your policy doesn’t fit with their procedures, guess what wins? Choose battles wisely.
• There is never enough storage space.
• Nobody cares until there’s a crisis.
• Software will not save you... but it might help!
Photo: “The Never Ending Math Problem”
http://www.flickr.com/photos/acidwashphotography/2967752733/
d3 Dan / CC-BY 2.0
Duke Data Accessioner

• Accessioning tool for digital data
  • use case: J. Important Scholar dumps her hard drive
    on your desk, expects you to cope
• File migrator, metadata manager, GUI,
  plugins (e.g. for file-format detection)
• Bit rough, but in production use.
  • http://library.duke.edu/uarchives/about/tools/data-accessioner.html
Archivematica

• Soup-to-nuts records management and
  digital preservation tool.
  • Evaluation and accessioning all the way through
    preservation actions. (Oddly, they seem to be
    missing disposal... but they’re in alpha, so...)
• Open source
  • Runs on a Linux server; RMs and archivists log in to
    GUI application remotely.
• Normally I hate and fear silos, but this one
  is smartly built on microservices.
Practical E-Records
• Weblog by Chris Prom and protégés
• Tool evaluations, conference-session
  writeups, essays on praxis
• Best reading out there for the do-it-yourselfer
• If you’re not reading it, why not?
• http://e-records.chrisprom.com/
Last thoughts

If you can’t do everything... that’s okay. Who can?

Image: “Confused”
http://www.flickr.com/photos/kristiand/3223044657/
Kristian D. / CC-BY 2.0
DO SOMETHING.




Photo: “Came hame háááá!”
http://www.flickr.com/photos/kristiand/3223044657/
Guirí R. Reyes / CC-BY 2.0
The worst threat? INACTION.

Photo: “Fatty’s role model”
http://www.flickr.com/photos/cloudzilla/4910616774/
cloudzilla / CC-BY 2.0
Thank you!
This presentation is available under a Creative Commons 3.0 United States license.

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Taming the Monster: Digital Preservation Planning and Implementation Tools

  • 1. Taming the Monster: Digital Preservation Planning and Implementation Tools. Dorothea Salo. One System, One Library, 2 June 2011. Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 2. Why is this so scary? Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 3. Isn’t this just as scary? Photo: “News Paper Origami Dragon Monster” http://www.flickr.com/photos/epsos/3777343342/ epSos.de / CC-BY 2.0
  • 4. Yet we persevere. Photo: “News Paper Origami Dragon Monster” http://www.flickr.com/photos/epsos/3777343342/ epSos.de / CC-BY 2.0
  • 5. DIGITAL IS NO DIFFERENT. Photo: “559 - The Matrix - Seamless Texture” http://www.flickr.com/photos/zooboing/4335531915/ Patrick Hoesly / CC-BY 2.0
  • 6. Many of the same ideas apply... • Planning and policy • Risk assessment • Risk management • (knowing that we can’t save everything) • Materials quality matters! • Problem discovery and remediation • Crisis management • Chief problems: staff, $$$, organizational commitment Photo: “Where I Teach” http://www.flickr.com/photos/eklektikos/2541408630/ Todd Ehlers / CC-BY 2.0
  • 7. Planning and assessment tools Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 8. Scene-setting • Rosenthal, David. “Requirements for Digital Preservation: a Bottom-Up Approach.” • http://www.dlib.org/dlib/november05/rosenthal/11rosenthal.html • If you’re new to this, or trying to find your feet, this is the best short introduction I know. • The list of threats is outstanding. Photo: “Bottoms Up! - Duck; San Anton Gardens, Malta” http://www.flickr.com/photos/foxypar4/3123113762/ John Haslam / CC-BY 2.0
  • 9. TRAC • “Trusted Repository Audit Checklist” • Despite the name, covers a LOT more than the technology! • Budget • Staffing • “designated communities” • CRL will audit you, if you like • (don’t, unless you’re really serious!) • http://catalog.crl.edu/record=b2212602~S1
  • 10. DRAMBORA • Digital Repository Audit Method Based on Risk Assessment • A “self-test,” if you will. • DRAMBORA is equally good as a pre- or post-test. • Personally, I prefer DRAMBORA to TRAC, especially for those just starting out. • http://www.repositoryaudit.eu/ • (registration required for toolkit access)
  • 11. Coping with file formats Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 12. The one acronym you need to know: FITS • “File Information Tool Set” • (you need to know this; otherwise it’s hard to Google) • Wrapper for several file-format detector software packages • Intended to be baked into other software • It’s early days yet! • (This means you can’t always trust what the tools tell you, especially when they’re telling you about errors.)
  • 13. What’s this file? • wotsit.org “The Programmer’s File and Data Resource” • Directory of file extensions • When in doubt: open in a browser or text editor and see what you get. • N.b.: Microsoft Word is NOT a text editor!
  • 14. Solving the geographic distribution problem Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 15. What problem, now? • The “all your eggs in one basket” problem. • If all your bits are on one server, and the server room is flooded, or your town is nuked—oops. • Not the same as backups! • Don’t get me wrong, backups are important! • Backups are SHORT-TERM, and usually LOCAL. Geographic distribution (plus associated auditing) is intended for the long term. • Don’t forget auditing! Photo: “Nido” http://www.flickr.com/photos/italintheheart/3679974298/ Jorge Elías / CC-BY 2.0
  • 16. LOCKSS • Lots of Copies Keeps Stuff Safe! • (There is also Portico, but Portico only works with e‑journal content.) • Open-source software that handles replication and (some) auditing. • “Private LOCKSS network” • A group of institutions agrees to build a LOCKSS network just for the stuff they’re interested in. • ASERL does this for ETDs. Many institutions (including UW-Madison) participate in a PLN for govdocs.
  • 17. “The cloud” • Typical cloud-based storage services make NO promises they won’t lose your stuff. • And for large quantities of data, bandwidth can become an issue. • And can they look at your stuff? Should they be able to? • Some early movers in this market fading • Iron Mountain had to kill their service. • DuraCloud • trying to finesse this issue by negotiating tougher SLAs with cloud-storage providers Photo: “Sky View From Humboldt Park” http://www.flickr.com/photos/purpleslog/2589612577/ Purple Slog / CC-BY 2.0
  • 18. Repository and digital-library platforms Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 19. Friendly word of advice: PICK SOFTWARE LAST. Photo: “Briana Calderon; future educator of america.” http://www.flickr.com/photos/46132085@N03/4703617843/ Arielle Calderon / CC-BY 2.0
  • 20. Another friendly word of advice: DON’T CHASE THE SHINY. Photo: “Sparkle Texture” http://www.flickr.com/photos/abbylanes/3214921616/ Abby Lane / CC-BY 2.0
  • 21. Digital-library software • Is almost always VERY BAD at digital preservation! • (most packages don’t even try!) • So if a file gets corrupted on the server, or whatever... no warnings, no restore, nothing. Also, provenance? Who needs provenance? Event tracking? What’s that? • I’m not saying don’t use it. I’m saying that it doesn’t solve this problem. • In fact, if you’re using this software, you need to solve this problem FOR IT. Photo: “National DIGITAL Library” http://www.flickr.com/photos/schex/193912573/ Jesse Schexnayder / CC-BY 2.0
  • 22. Examples • ContentDM: http://contentdm.com/ • Omeka: http://omeka.org/ • Greenstone: http://greenstone.org/
  • 23. Institutional-repository software • Is SHOCKINGLY bad at digital preservation! • (Though sometimes better than most DL software.) • Examples • Hosted/commercial: Digital Commons (BePress), ContentDM, DigiTool • If you go hosted, you’d better ask about their digital-preservation practices! • Open-source: EPrints, DSpace, Fedora Photo: “IMG_0668” http://www.flickr.com/photos/12967790@N00/66531124 Robert / CC-BY 2.0
  • 24. A new approach: curation microservices Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 25. Do we really need THE BLOB? Photo: “giant crystal blob” http://www.flickr.com/photos/a_of_doom/527905701/ A of DooM / CC-BY 2.0
  • 26. How about a jigsaw puzzle instead? • Break the digital-preservation problem down into parts. • Code up each part, making sure that it plays nicely with other parts. • lots of nice APIs! • which means other software can adopt/adapt microservices as well! • Put parts together as you need them. Photo: “Lapsana Apogonoides Puzzle” http://www.flickr.com/photos/gdesigneralex/2313092112/ gdesigneralex / CC-BY 2.0
  • 27. California Digital Library • Pioneering this approach • Has open-sourced code for microservices • Has added microservices together to build its “Merritt” storage/repository service
  • 28. Escaping the silos: Fedora Commons Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 29. What is Fedora Commons? • Blueprints and foundation, not the whole house (analogy credit to Peter Gorman) • You build the house you want! • Or you build condominiums on the same foundation. • Need different user interfaces for different materials? • Need different structures and behaviors? • No problem! Fedora can handle that. • (have I run this analogy into the ground yet?)
  • 30. We had this... Diagram courtesy of Peter Gorman.
  • 31. We are building this. Diagram courtesy of Peter Gorman.
  • 32. E-records management Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 33. Axioms • Records management is about policy and procedures. • If your policy doesn’t fit with their procedures, guess what wins? Choose battles wisely. • There is never enough storage space. • Nobody cares until there’s a crisis. • Software will not save you... but it might help! Photo: “The Never Ending Math Problem” http://www.flickr.com/photos/acidwashphotography/2967752733/ d3 Dan / CC-BY 2.0
  • 34. Duke Data Accessioner • Accessioning tool for digital data • use case: J. Important Scholar dumps her hard drive on your desk, expects you to cope • File migrator, metadata manager, GUI, plugins (e.g. for file-format detection) • Bit rough, but in production use. • http://library.duke.edu/uarchives/about/tools/data-accessioner.html
  • 35. Archivematica • Soup-to-nuts records management and digital preservation tool. • Evaluation and accessioning all the way through preservation actions. (Oddly, they seem to be missing disposal... but they’re in alpha, so...) • Open source • Runs on a Linux server; RMs and archivists log in to GUI application remotely. • Normally I hate and fear silos, but this one is smartly built on microservices.
  • 36. Practical E-Records • Weblog by Chris Prom and protégés • Tool evaluations, conference-session writeups, essays on praxis • Best reading out there for the do-it-yourselfer • If you’re not reading it, why not? • http://e-records.chrisprom.com/
  • 37. Last thoughts Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
  • 38. If you can’t do everything... that’s okay. Who can? Image: “Confused” http://www.flickr.com/photos/kristiand/3223044657/ Kristian D. / CC-BY 2.0
  • 39. DO SOMETHING. Photo: “Came hame háááá!” http://www.flickr.com/photos/kristiand/3223044657/ Guirí R. Reyes / CC-BY 2.0
  • 40. The worst threat? INACTION. Photo: “Fatty’s role model” http://www.flickr.com/photos/cloudzilla/4910616774/ cloudzilla / CC-BY 2.0
  • 41. Thank you! This presentation is available under a Creative Commons 3.0 United States license. Photo: “Happy Easter, to my Peeps” http://www.flickr.com/photos/76074333@N00/449028423/ WorldIslandInfo.com / CC-BY 2.0
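
Slides 15 and 21 stress auditing: if a file gets corrupted on the server and nothing checks, nobody finds out. As one way to "do something," here is a minimal sketch of a checksum (fixity) audit in Python. It is an illustration only, not how LOCKSS, Merritt, Archivematica, or any other tool named in the deck works; the function names (`sha256_of`, `build_manifest`, `audit`) and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def audit(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root against a previously stored manifest."""
    current = build_manifest(root)
    return {
        # Recorded earlier, gone now.
        "missing": sorted(set(manifest) - set(current)),
        # Still present, but the bits no longer match.
        "changed": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Present now, never recorded.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

In use, you would call `build_manifest` once at accession time, store the result somewhere other than the server being audited, and run `audit` on a schedule against every copy. A non-empty "missing" or "changed" list is the warning that the software on slide 21 never gives you; the repair itself still has to come from a good copy held elsewhere.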