Measure All the (Web Archiving) Things!
Nicholas Taylor
Web Archiving Service Manager
Stanford University Libraries
Archive-It Partner Meeting
August 18, 2015
how many more websites are we archiving?
“Library_01.jpg” by British Library
crawl report list
Archive-It: “Crawls for Account #198”
seeds for individual crawl
Archive-It: “Seeds for Crawl #99435”
download seed list
Archive-It: “Seeds for Crawl #99435”
downloaded seed list
whew, that was easy!
oh, wait a minute…
seed lists are per crawl
well, how many crawls are there?
• 6 accounts
• oldest active since 2007
• 30+ collections
• hundreds of crawls
count and average not enough
• seeds move in and out of crawls
• seeds have different frequencies
• new seeds w/ new URLs for old seeds
• “university website” is many seeds
plus
• non-Archive-It web archiving activity
“Dichotomic Maples” by francoismi under CC BY-NC-SA 2.0
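The counting problem on the two slides above can be sketched in code. This is a minimal illustration, not the method used in the talk: it assumes each crawl's downloaded seed list has already been read into a Python list of URLs tagged with a year, and the names `normalize_seed` and `unique_seeds_by_year`, the normalization rule, and the example URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def normalize_seed(url):
    """Reduce a seed URL to a comparable key.

    Assumed rule (not from the talk): ignore scheme, a leading "www.",
    host case, and a trailing slash.
    """
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def unique_seeds_by_year(crawls):
    """Return {year: cumulative number of distinct seeds seen so far}.

    crawls is an iterable of (year, [seed URLs]) pairs, one per crawl.
    """
    seen = set()
    totals = {}
    for year, seeds in sorted(crawls, key=lambda crawl: crawl[0]):
        seen.update(normalize_seed(seed) for seed in seeds)
        totals[year] = len(seen)
    return totals


# Made-up seed lists for three crawls.
crawls = [
    (2014, ["http://example.edu/", "http://www.example.edu/library/"]),
    (2014, ["http://example.edu", "http://news.example.edu/"]),
    (2015, ["https://www.example.edu/library", "http://example.org/"]),
]
print(unique_seeds_by_year(crawls))
```

Taking the union of seeds across crawls handles seeds that move in and out of crawls or recur at different frequencies. It does not handle a website that moved to a new URL, or a single "website" that is spread across many seeds; those cases would still need a manually maintained mapping from seeds to websites.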
“what gets measured, gets managed”
“Gudauri still life” by Carsten ten Brink under CC BY-NC-ND 2.0
why measure?
• advocacy/outreach
• service modeling
• program assessment
• policy making
• staffing assessment
• grant support
• prioritization
• risk assessment
“Measuring river depth” by epeirogenic under CC BY-NC 2.0
what to measure?
• How to handle the data volume?
• What is the usage of web archives?
• How much does web archiving cost?
• How to assure the quality of archived content?
• How to secure institutional buy-in?
• How much loss have resources suffered?
• What is the impact of policy requirements?
community-valued metrics
[bar chart: percentage of organizations, 0% to 60%, by category: Volume, Usage, Cost, Quality, Buy-in, Loss, Policy]
NDSA: “Web Archiving in the United States: a 2013 Survey”
volume
• websites
– captured
– preserved
– described
• data
– captured
– preserved
• objects
– captured
– preserved
“typography jumble” by Bill Dickinson under CC BY-NC 2.0
usage
• web analytics
– visitors
– visits
– referrers
• actual use cases
(who + how many?)
– research
– teaching
– institutional legacy
– compliance
“113/365 Days: A page from my heart” by LaughingRhoda under CC BY-NC-ND 2.0
cost
• external
– out-payments for web archiving services
– quota utilization
• internal
– staff time, by activity
– storage
“Largest square from a dollar bill” by origami_madness under CC BY-NC 2.0
performance
• accessioning throughput
• service request turnaround
• collections/websites w/ discovery records
• time to regenerate full-text index
“Lower rack” by Andy Melton under CC BY-SA 2.0
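As one example of a performance measure from the slide above, service request turnaround could be summarized along the following lines. This is only a sketch: it assumes each request is recorded as an (opened, closed) pair of dates, and the function name `turnaround_days` and the sample dates are invented for illustration.

```python
from datetime import date
from statistics import median


def turnaround_days(requests):
    """Return the number of days between opening and closing each request."""
    return [(closed - opened).days for opened, closed in requests]


# Made-up request log: (date opened, date closed).
requests = [
    (date(2015, 1, 5), date(2015, 1, 9)),
    (date(2015, 2, 2), date(2015, 2, 3)),
    (date(2015, 3, 10), date(2015, 3, 24)),
]

days = turnaround_days(requests)
print("median turnaround (days):", median(days))
print("slowest request (days):", max(days))
```

The median is reported rather than the mean so that one unusually slow request does not dominate the summary; the maximum is shown alongside it so the outlier is still visible.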
community-valued…metrics?
[bar chart: percentage of organizations, 0% to 60%, by category: Volume, Usage, Cost, Quality, Buy-in, Loss, Policy]
NDSA: “Web Archiving in the United States: a 2013 Survey”
“not everything that counts can be counted”
“Ten Floods, Twenty-Five Trees, Nineteen Bubbles...” by Flood G. under CC BY-NC-ND 2.0
quality
• use case-specific?
• benchmark to ideal or to limits of tools?
• quantifiable metrics?
• existing metrics as proxies for quality?
• sampling approach?
• not just missing content but also collected junk
NYARC: “I. Introduction - NYARC Documentation”
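The "sampling approach?" question above could be explored with something like the following sketch. It assumes a list of captured URLs is available and that a reviewer records a pass/fail judgment per sampled capture. The function names, the fixed random seed, and the idea of a single pass/fail judgment are all assumptions made for illustration, not a procedure described in the talk.

```python
import random


def sample_for_review(captured_urls, k, seed=0):
    """Draw a reproducible random sample of up to k captured URLs."""
    urls = list(captured_urls)
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    return rng.sample(urls, min(k, len(urls)))


def defect_rate(review_results):
    """Return the share of reviewed captures judged unacceptable.

    review_results maps each sampled URL to True (acceptable) or False.
    """
    if not review_results:
        return 0.0
    failures = sum(1 for ok in review_results.values() if not ok)
    return failures / len(review_results)


# Made-up capture list and reviewer judgments.
captured = [f"http://example.edu/page{i}" for i in range(100)]
sample = sample_for_review(captured, 5)
results = {url: True for url in sample}
results[sample[0]] = False  # pretend one capture failed review
print(len(sample), defect_rate(results))
```

A rate estimated this way only speaks to missing or broken content in what was captured. Collected junk, the other half of the last bullet above, would need its own review criterion.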
buy-in
• unique nominators?
• projects w/ web archiving component?
• budgetary commitments?
• resource commitments?
• charge for service?
• testimonials?
“The Play” by Ryan Hyde under CC BY-SA 2.0
loss
UK Web Archive: “Ten years of the UK Web Archive: What have we saved?”
policy
• first capture under embargo
• opt-out requests
• takedown requests
• external environment
“We apologise for any convenience - Update” by Alan Stanton under CC BY-SA 2.0
better measures, measuring better
“Line Art Project #2 VIS3 UCSD” by Mandy Jouan under CC BY-NC-ND 2.0

Measure All the (Web Archiving) Things!

  • 1. Measure All the (Web Archiving) Things! Nicholas Taylor Web Archiving Service Manager Stanford University Libraries Archive-It Partner Meeting August 18, 2015
  • 2. how many more websites are we archiving? “Library_01.jpg” by British Library
  • 3. crawl report list Archive-It: “Crawls for Account #198”
  • 4. seeds for individual crawl Archive-It: “Seeds for Crawl #99435”
  • 5. download seed list Archive-It: “Seeds for Crawl #99435”
  • 6. downloaded seed list
  • 7. whew, that was easy!
  • 8. oh, wait a minute… seed lists are per crawl well, how many crawls are there? • 6 accounts • oldest active since 2007 • 30+ collections • hundreds of crawls
  • 9. count and average not enough • seeds move in and out of crawls • seeds have different frequencies • new seeds w/ new URLs for old seeds • “university website” is many seeds plus • non Archive-It web archiving activity “Dichotomic Maples” by francoismi under CC BY-NC-SA 2.0
  • 10. “what gets measured, gets managed” “Gudauri still life” by Carsten ten Brink under CC BY-NC-ND 2.0
  • 11. why measure? • advocacy/outreach • service modeling • program assessment • policy making • staffing assessment • grant support • prioritization • risk assessment “Measuring river depth” by epeirogenic under CC BY-NC 2.0
  • 12. what to measure? • How to handle the data volume? • What is the usage of web archives? • How much does web archiving cost? • How to assure the quality of archived content? • How to secure institutional buy-in? • How much loss have resources suffered? • What is the impact of policy requirements?
  • 13. community-valued metrics 0% 10% 20% 30% 40% 50% 60% Volume Usage Cost Quality Buy-in Loss Policy Percentage of organizations NDSA: “Web Archiving in the United States: a 2013 Survey”
  • 14. volume • websites – captured – preserved – described • data – captured – preserved • objects – captured – preserved “typography jumble” by Bill Dickinson under CC BY-NC 2.0
  • 15. usage • web analytics – visitors – visits – referers • actual use cases (who + how many?) – research – teaching – institutional legacy – compliance “113/365 Days: A page from my heart” by LaughingRhoda under CC BY-NC-ND 2.0
  • 16. cost • external – out-payments for web archiving services – quota utilization • internal – staff time, by activity – storage “Largest square from a dollar bill” by origami_madness under CC BY-NC 2.0
  • 17. performance • accessioning throughput • service request turnaround • collections/websites w/ discovery records • time to regenerate full-text index “Lower rack” by Andy Melton under CC BY-SA 2.0
  • 18. community-valued…metrics? 0% 10% 20% 30% 40% 50% 60% Volume Usage Cost Quality Buy-in Loss Policy Percentage of organizations NDSA: “Web Archiving in the United States: a 2013 Survey”
  • 19. “not everything that counts can be counted” “Ten Floods, Twenty-Five Trees, Nineteen Bubbles...” by Flood G. under CC BY-NC-ND 2.0
  • 20. quality • use case-specific? • benchmark to ideal or to limits of tools? • quantifiable metrics? • existing metrics as proxies for quality? • sampling approach? • not just missing content but also collected junk NYARC: “I. Introduction - NYARC Documentation”
  • 21. buy-in • unique nominators? • projects w/ web archiving component? • budgetary commitments? • resource commitments? • charge for service? • testimonials? “The Play” by Ryan Hyde under CC BY-SA 2.0
  • 22. loss UK Web Archive: “Ten years of the UK Web Archive: What have we saved?”
  • 23. policy • first capture under embargo • opt-out requests • takedown requests • external environment “We apologise for any convenience - Update” by Alan Stanton under CC BY-SA 2.0
  • 24. better measures, measuring better “Line Art Project #2 VIS3 UCSD” by Mandy Jouan under CC BY-NC-ND 2.0
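
Slides 8 and 9 note that seed lists are per crawl and that a simple count or average is not enough, because seeds move in and out of crawls and old seeds reappear under new URLs. One possible way to approximate "how many more websites are we archiving?" is to merge the downloaded per-crawl seed lists and count only seeds not seen in earlier crawls. The sketch below is illustrative only: the function names, the crawl labels, and the normalization rule (drop the scheme, a leading `www.`, and a trailing slash) are assumptions, not something the slides or Archive-It prescribe, and this rule would not catch a seed that moved to an entirely different domain.

```python
from urllib.parse import urlsplit


def normalize_seed(url):
    """Reduce a seed URL to a comparable key (hypothetical rule, see lead-in)."""
    url = url.strip()
    if "://" not in url:
        # Assume a bare host/path was entered without a scheme.
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is already lowercased
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def new_seeds_per_crawl(crawls):
    """Count seeds that first appear in each crawl.

    crawls: list of (crawl_label, [seed URLs]) in chronological order.
    Returns (set of all distinct seed keys, {crawl_label: new seed count}).
    """
    seen = set()
    new_counts = {}
    for label, seeds in crawls:
        keys = {normalize_seed(s) for s in seeds}
        new_counts[label] = len(keys - seen)
        seen |= keys
    return seen, new_counts


# Made-up seed lists standing in for two downloaded crawl reports.
crawls = [
    ("crawl-a", ["http://www.example.edu/", "http://example.org/news"]),
    ("crawl-b", ["https://example.edu", "http://example.net/"]),
]
distinct, new_counts = new_seeds_per_crawl(crawls)
print(len(distinct), new_counts)
```

Here `crawl-b` contributes only one new seed, because `https://example.edu` normalizes to the same key as the `www` seed in `crawl-a`. Grouping many seeds under one "university website", as slide 9 describes, would still need a manually maintained mapping.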