Robust Linking to Web Resources
1. Robust Linking to Web Resources
@mart1nkle1n
DtMH 2017, 11/15/2017, San Francisco, CA
Robust Linking to Web Resources
http://robustlinks.mementoweb.org/
Martin Klein
@mart1nkle1n
Research Library
Los Alamos National Laboratory
Acknowledgements:
Herbert Van de Sompel, LANL
Harihar Shankar, LANL
Michael L. Nelson, ODU
Mark Graham, Internet Archive
2. Robust Linking to Web Resources
Slide by Herbert Van de Sompel, 2017
A Managed Collection Desires Reliable Outlinks
3. Robust Linking to Web Resources
Slide by Herbert Van de Sompel, 2017
Links to another Managed Collection
4. Robust Linking to Web Resources
Slide by Herbert Van de Sompel, 2017
Links to Web at Large Resources
5. Robust Linking to Web Resources
Link Rot
6. Robust Linking to Web Resources
https://web.archive.org/web/20140101072007/http://netpreserve.org/general-assembly/2013/overview
IIPC
2013
7. Robust Linking to Web Resources
http://netpreserve.org/general-assembly/2013/overview
IIPC
today
8. Robust Linking to Web Resources
Content Drift
9. Robust Linking to Web Resources
https://web.archive.org/web/20161228184110/https://www.epa.gov/climatechange
EPA
12/2016
10. Robust Linking to Web Resources
https://www.epa.gov/sites/production/files/signpost/cc.html
EPA
today
11. Robust Linking to Web Resources
Problem
• On the web, all links are subject to reference rot
• Reference rot hinders our ability to follow links as they were intended when they were put in place
• Link rot: a link stops working altogether
• Content drift: the linked content changes over time and may eventually no longer be representative of the content that was originally linked
12. Robust Linking to Web Resources
Reference Rot in Scholarly Communication
http://dx.doi.org/10.1371/journal.pone.0115253
http://dx.doi.org/10.1371/journal.pone.0167475
13. Robust Linking to Web Resources
Link Rot in Scholarly Articles
14. Robust Linking to Web Resources
Link Rot in Scholarly Articles
15. Robust Linking to Web Resources
Reference Rot Over Time - arXiv
16. Robust Linking to Web Resources
Problem
• On the web, all links are subject to reference rot
• Reference rot hinders our ability to follow links as they were intended when they were put in place
• Link rot: a link stops working altogether
• Content drift: the linked content changes over time and may eventually no longer be representative of the content that was originally linked
How can we:
1. Make links more robust?
2. Make them actionable for humans and machines?
17. Robust Linking to Web Resources
Robust Links
18. Robust Linking to Web Resources
Robust Links
1. Create a snapshot of referenced resources in a public web archive
19. Robust Linking to Web Resources
Why multiple archives? They aren’t magic web sites!
They’re just web sites.
If you used Mummify, you’re now left with a bunch of defunct, shortened links like:
https://mummify.it/XbmcMfE3
Slide by Michael L. Nelson, 2016
20. Robust Linking to Web Resources
Robust Links
1. Create a snapshot of referenced resources in a publicly available web archive
2. Decorate links with:
• URI of archived snapshot
• datetime of archiving
• resource’s original URI
21. Robust Linking to Web Resources
Link Decoration with Standard HTML
<a href="http://web.archive.org/web/20171108053054/http://sfgov.org/"
data-originalurl="http://sfgov.org/"
data-versiondate="2017-11-08">
City and County of San Francisco</a>
http://robustlinks.mementoweb.org/spec
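The decoration above could be generated programmatically. The following is a minimal Python sketch, not the project's own tooling: the function names `version_date_from_wayback` and `robust_link` are hypothetical, and it assumes the snapshot URI follows the Wayback Machine pattern `/web/{14-digit timestamp}/{original URI}` seen in the example.

```python
import re
from datetime import datetime
from html import escape

# Assumption: snapshot URIs carry a 14-digit timestamp, as in the slide's example.
_WAYBACK_TS = re.compile(r"/web/(\d{14})/")


def version_date_from_wayback(snapshot_url: str) -> str:
    """Derive a YYYY-MM-DD version date from a Wayback-style snapshot URI."""
    match = _WAYBACK_TS.search(snapshot_url)
    if match is None:
        raise ValueError(f"no 14-digit timestamp found in {snapshot_url!r}")
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured.date().isoformat()


def robust_link(original_url: str, snapshot_url: str, text: str) -> str:
    """Build an <a> element carrying the two data- attributes shown on the slide."""
    version_date = version_date_from_wayback(snapshot_url)
    return (
        f'<a href="{escape(snapshot_url)}"\n'
        f'data-originalurl="{escape(original_url)}"\n'
        f'data-versiondate="{escape(version_date)}">\n'
        f"{escape(text)}</a>"
    )


snippet = robust_link(
    original_url="http://sfgov.org/",
    snapshot_url="http://web.archive.org/web/20171108053054/http://sfgov.org/",
    text="City and County of San Francisco",
)
print(snippet)
```

Deriving the date from the snapshot URI keeps `data-versiondate` consistent with the capture that `href` points to; for archives with a different URI layout the date would have to be supplied separately.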
22. Robust Linking to Web Resources
Link Decoration via API
http://robustlinks.mementoweb.org/api/json/{URI-of-archived-snapshot}
Example: http://robustlinks.mementoweb.org/api/json/http://web.archive.org/web/20171108053054/http://sfgov.org/
• Submit URI of an archived snapshot
• Retrieve Robust Links HTML snippet
• Copy and paste into your application
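A request against this URL template might be assembled as sketched below. This is an illustration under assumptions: the helper names `api_url` and `fetch_robust_link` are hypothetical, the snapshot URI is appended unencoded because that is how the slide's example shows it, and the structure of the response is not described in the slides, so the body is returned as raw text. The service may no longer be reachable at this address.

```python
from urllib.request import urlopen

# Base of the URL template shown on the slide.
API_BASE = "http://robustlinks.mementoweb.org/api/json/"


def api_url(snapshot_uri: str) -> str:
    """Fill the slide's URL template with the URI of an archived snapshot."""
    # Assumption: the snapshot URI is appended as-is, matching the slide's example.
    return API_BASE + snapshot_uri


def fetch_robust_link(snapshot_uri: str, timeout: float = 10.0) -> str:
    """Return the raw response body; its format is not specified in the slides."""
    with urlopen(api_url(snapshot_uri), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


request_url = api_url("http://web.archive.org/web/20171108053054/http://sfgov.org/")
print(request_url)
```

Only the URL is built when the example runs; calling `fetch_robust_link` performs the actual network request and is left to the reader.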
23. Robust Linking to Web Resources
Robust Links
1. Create a snapshot of referenced resources in a publicly available web archive
2. Decorate links with:
• URI of archived snapshot
• datetime of archiving
• resource’s original URI
Benefits:
• Can visit archived, immutable version of referenced resource
• Original URI & capture datetime allow finding versions in other web archives
• Uniform, machine-actionable
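To illustrate what "machine-actionable" can mean in practice, here is a minimal Python sketch that extracts the three pieces of information from decorated links using only the standard library. The class name `RobustLinkParser`, the dictionary keys, and the sample page are assumptions made for this example; the attribute names `data-originalurl` and `data-versiondate` are the ones shown in the deck.

```python
from html.parser import HTMLParser


class RobustLinkParser(HTMLParser):
    """Collect <a> elements that carry both Robust Links data- attributes."""

    def __init__(self):
        super().__init__()
        self.links = []        # one dict per decorated link found
        self._current = None   # decorated link currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if "data-originalurl" in attributes and "data-versiondate" in attributes:
            self._current = {
                "snapshot": attributes.get("href"),
                "original": attributes["data-originalurl"],
                "versiondate": attributes["data-versiondate"],
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse whitespace left over from line breaks inside the link text.
            self._current["text"] = " ".join(self._current["text"].split())
            self.links.append(self._current)
            self._current = None


# Sample page: one decorated link (from the deck) and one plain link.
PAGE = """
<p><a href="http://web.archive.org/web/20171108053054/http://sfgov.org/"
data-originalurl="http://sfgov.org/"
data-versiondate="2017-11-08">
City and County of San Francisco</a>
and <a href="http://example.org/">an undecorated link</a></p>
"""

parser = RobustLinkParser()
parser.feed(PAGE)
links = parser.links
for link in links:
    print(link["original"], link["versiondate"], link["snapshot"])
```

With the original URI and the version date in hand, a client could look for captures of the same resource in another archive if the linked snapshot becomes unavailable.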
24. Robust Linking to Web Resources
Robust Links for Machines
1. JavaScript
2. Browser extensions
a. Memento for Chrome
b. IA Chrome Extension
25. Robust Linking to Web Resources
Robust Links in Action - JavaScript
http://dx.doi.org/10.1045/november2015-vandesompel
26. Robust Linking to Web Resources
Robust Links in Action - JavaScript
http://dx.doi.org/10.1045/november2015-vandesompel
27. Robust Linking to Web Resources
Robust Links in Action – Memento for Chrome
https://chrome.google.com/webstore/detail/memento-time-travel/jgbfpjledahoajcppakbgilmojkaghgm
28. Robust Linking to Web Resources
Robust Links in Action – Memento for Chrome
http://robustlinks.mementoweb.org/demo/uri_references.html
29. Robust Linking to Web Resources
Robust Links in Action – IA Chrome Extension
https://chrome.google.com/webstore/detail/wayback-machine/fpnmgdkabkmnadcjpehmlllkndpkmiak
30. Robust Linking to Web Resources
Robust Links in Action – IA Chrome Extension
31. Robust Linking to Web Resources
Robust Links in Action – IA Chrome Extension
32. Robust Linking to Web Resources
Robust Links in Action – IA Chrome Extension
33. Robust Linking to Web Resources
Take-Aways
• Links on the web are subject to reference rot
• “Robustifying” them (manually or via API calls) can help alleviate the problem
• Link decorations as proposed by Robust Links are
  • based on HTML standards
  • machine-actionable
• Organizations such as the Internet Archive, Wikipedia, and news publishers can help with adoption