Can we save the web?
WEB ARCHIVING
Vangelis Banos
http://vbanos.gr/
Unconference, 9-10 December 2013
Can you save the web? Web Archiving!

  1. Can we save the web? WEB ARCHIVING Vangelis Banos http://vbanos.gr/ Unconference, 9-10 December 2013
  2. Can we save the web? • What do you mean? • What is web archiving? • The practical use of web archives. • Making your own web archive.
  3. What is the World Wide Web? A huge collection of digital documents (websites) which are stored on special computers (web servers), interconnected with each other.
  4. What is the World Wide Web?
  5. What is the World Wide Web?
  6. What is the World Wide Web?
  7. What is on the web? What isn’t on the web?
  8. Why save the web? 1. More and more material is born digital and exists only in digital form. 2. Some websites contain unique data and valuable information. – Users take action and make important decisions based on this information. 3. The web is a live record of contemporary society, culture, science, and economy. 4. We have a responsibility to preserve the web. 5. Saving the web promotes transparency.
  9. Isn’t the web already safe? • The answer is: NOT really! • Websites are in danger: – Organisations that maintain them stop caring about them, – Organisations that maintain them cease to exist, – Natural disasters destroy computer facilities (fires, floods, storms, etc.), – Technical problems damage websites (bugs, computer viruses, backup failures, hardware failures), – Their data are tampered with on purpose, for many reasons (political, financial, criminal, etc.).
  10. A major blog hosting company was shut down by the U.S. authorities
  11. Yahoo GeoCities has closed.
  12. Natural disasters cause data center problems
  13. Websites are tampered with all the time
  14. Websites are tampered with all the time
  15. Does this sound familiar?
  16. Can we save the web? • What do you mean? • What is web archiving? • The practical use of web archives. • Making your own web archive.
  17. Websites are tampered with all the time
  18. Web Archiving: The Internet Archive has backups. (Slide footer: MTSR 2013, 22 Nov 2013, Thessaloniki)
  19. WEB ARCHIVING The process of collecting portions of the World Wide Web to ensure the information is preserved in an archive for future researchers, historians, and the public.
  20. Challenges • How is it done technically? • What should I choose to archive? – The whole website? Some pages? Some files only? • What do I want to do with the web archive I’m creating? • Who will have access? • Who is the owner of the web archive content?
  21. Archiving web pages is a technical challenge [Diagram: Generic file archiving operation; labels: File(s), Software, Hardware, RECORD]
  22. Archiving web pages is a technical challenge [Diagram: Web archiving operation; labels: Website, File(s) (repeated), Software (repeated), Hardware]
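The contrast between slides 21 and 22 (one file versus a website made of many files) can be illustrated with a rough sketch. The code below is not from the presentation; it assumes a tiny hypothetical page on `www.example.org` and uses only Python's standard library to list the other files that a single HTML page refers to. It ignores many real-world cases (CSS `url()` references, `srcset`, content loaded by scripts), which is part of why web archiving is harder than copying one file.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collect the URLs that one HTML page refers to."""

    # Tag -> attribute that holds a URL. Deliberately incomplete.
    URL_ATTRS = {"a": "href", "link": "href", "img": "src", "script": "src"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URL_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page address.
                self.resources.append(urljoin(self.base_url, value))


def list_resources(html, base_url):
    """Return the absolute URLs referenced by the given HTML text."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return collector.resources


# Hypothetical page used only for illustration.
page = """<html><head><link rel="stylesheet" href="/style.css">
<script src="app.js"></script></head>
<body><img src="images/logo.png"><a href="about.html">About</a></body></html>"""

for url in list_resources(page, "http://www.example.org/index.html"):
    print(url)
```

Even this four-line page depends on four further files, each of which may in turn refer to others.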
  23. How is it done? • Possible web archiving targets: – Government websites, – Educational institutions, – People’s suggestions, – Currently popular websites, – Popular media, – Big companies, – Special events
  24. Web archiving strategies
  25. Who is working on web archiving? Many important organisations have been working on web archiving since 1996.
  26. International Internet Preservation Consortium • IIPC Members – National Libraries, – Academic Libraries, – Cultural Organisations, – Universities, – Software development companies • Web Archiving Timeline – http://timeline.webarchivists.org/
  27. Obligation of the National Library • According to UNESCO: – «a national library is responsible for the collection and storage of the national cultural heritage». • In Greece, according to law No.3149/03: – «publishers or authors (when there is no publisher) of any printed material, are obliged to submit three copies of their work to the National Library of Greece. This obligation also includes audiovisual and e-publishing material». • What about the Greek web?
  28. Bibliothèque nationale de France 2006: legal deposit extended to “signs, signals, writings, images, sounds or messages of any kind communicated to the public by electronic means”. The goal is not to gather the «best of the web», but to preserve a collection representative of the web at a certain date.
  29. Can we save the web? • What do you mean? • What is web archiving? • The practical use of web archives. • Making your own web archive.
  30. Visiting the Internet Archive • http://archive.org/
  31. Internet Archive activities • Key features, browsing, searching. • Indicative web sites: – Ministry of Education, 3 Jul 2010, www.minedu.gov.gr – Ministry of Development, 21 Dec 2009, http://www.ypoian.gr/ – The White House, 7 Apr 2000, http://www.whitehouse.gov – BBC, 11 Sept 2001, http://www.bbc.co.uk/
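The site-and-date pairs on slide 31 can be turned into Wayback Machine addresses. The sketch below is an illustration added to this transcript, not part of the talk; it assumes the Wayback Machine's usual address pattern, `https://web.archive.org/web/<timestamp>/<original URL>`, where a date-only timestamp leads to the capture closest to that date. The helper name `wayback_url` is hypothetical.

```python
from datetime import date

WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url, day):
    """Build a Wayback Machine address for a page as captured near a given day."""
    timestamp = day.strftime("%Y%m%d")  # date-only timestamp, e.g. 20010911
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


# Dates and sites taken from the slide.
examples = [
    ("http://www.minedu.gov.gr", date(2010, 7, 3)),
    ("http://www.ypoian.gr/", date(2009, 12, 21)),
    ("http://www.whitehouse.gov", date(2000, 4, 7)),
    ("http://www.bbc.co.uk/", date(2001, 9, 11)),
]

for url, day in examples:
    print(wayback_url(url, day))
```

Whether a capture exists for a given date depends on what the Internet Archive actually crawled; the address only expresses the request.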
  32. Visiting Archive-It • http://archive-it.org/
  33. Archive-It activities • Key features, browsing, searching, collections. • Examples: – Egypt Revolution and politics, American University in Cairo, – 2008 Beijing Olympic Games, – Libyan Uprisings, University of Michigan, – Venice Biennale 2013
  34. Can we save the web? • What do you mean? • What is web archiving? • The practical use of web archives. • Making your own web archive.
  35. HTTrack website copier http://www.httrack.com
  36. Making your own web archive • Using HTTrack software (Open Source) – Installation – Practical advice – Features – Usage scenarios • Archive http://2013.futurelibrary.gr/ • Archive http://www.auth.gr/
  37. Things worth considering • Set limits – Filters to define the file types you want to copy. – Bandwidth limits and connection limits to avoid overloading the site you are archiving AND to avoid saturating your library network. – Time limits • Check the size of the files you have downloaded. • Plan for disk space according to your needs. • Check the target website’s copyright terms. Are you allowed to: – Archive for personal use? – Archive for public use on library computers? – Archive to publish on the web? • If you are not sure, please ask the website owner before beginning web archiving.
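The limits listed on slide 37 map onto HTTrack's command-line options. The following is a minimal sketch, not the speaker's own setup: it only builds and prints a command line, and it assumes the option letters from HTTrack's command-line help (`-O` output path, `-cN` simultaneous connections, `-AN` maximum transfer rate in bytes per second, `-EN` maximum mirror time in seconds, `-MN` maximum overall size in bytes, and `+pattern` / `-pattern` filters). Check `httrack --help` for your installed version before relying on them. The function name, output folder, and the specific numbers are illustrative assumptions.

```python
import shlex


def build_httrack_command(url, output_dir, *, filters=(), connections=None,
                          max_rate_bytes=None, max_seconds=None,
                          max_total_bytes=None):
    """Assemble an HTTrack command line with optional limits (does not run it)."""
    cmd = ["httrack", url, "-O", output_dir]
    if connections is not None:
        cmd.append(f"-c{connections}")      # connection limit
    if max_rate_bytes is not None:
        cmd.append(f"-A{max_rate_bytes}")   # bandwidth limit, bytes per second
    if max_seconds is not None:
        cmd.append(f"-E{max_seconds}")      # time limit for the whole mirror
    if max_total_bytes is not None:
        cmd.append(f"-M{max_total_bytes}")  # overall size limit
    cmd.extend(filters)                     # file-type filters
    return cmd


# Illustrative values only: 2 connections, about 50 KB/s, one hour at most,
# and no archive or disk-image downloads.
command = build_httrack_command(
    "http://2013.futurelibrary.gr/",
    "webarchive/futurelibrary",
    filters=["-*.zip", "-*.iso"],
    connections=2,
    max_rate_bytes=50000,
    max_seconds=3600,
)
print(shlex.join(command))
```

Passing the list to `subprocess.run(command)` would start the mirror; printing it first makes it easy to review the limits before any traffic is sent to the target site.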
  38. Scenario: create your own mini web archive in your library on a shoestring. • Equipment: – Typical Windows computer with the biggest possible hard disk (the more TB, the better). – Backup disk of equal size (e.g. external USB hard disk). – DSL Internet connection. – HTTrack open source software • Select important local websites. • Get permissions from website owners if necessary. • Set up a regular web archiving schedule (e.g. once per month). • Provide information and access to the web archive on your library’s local computers for the public.
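For the monthly schedule and the disk-space planning in this scenario, a small helper script can keep snapshots apart and warn before the disk fills up. This is a hedged sketch using only Python's standard library; the folder layout (`<archive root>/<site>/<year-month>`), the function names, and the safety factor of 2 are assumptions made for illustration, not recommendations from the talk.

```python
import shutil
from datetime import date
from pathlib import Path


def snapshot_dir(archive_root, site_name, day):
    """Folder for one monthly snapshot, e.g. <root>/auth.gr/2013-12."""
    return Path(archive_root) / site_name / day.strftime("%Y-%m")


def directory_size_bytes(path):
    """Total size of all files below a folder (a finished snapshot, for example)."""
    return sum(f.stat().st_size for f in Path(path).rglob("*") if f.is_file())


def has_room_for(archive_root, expected_bytes, safety_factor=2.0):
    """True if the disk holding archive_root has clearly more free space than needed."""
    free = shutil.disk_usage(archive_root).free
    return free >= expected_bytes * safety_factor


# Illustrative use: where would the December 2013 snapshot of auth.gr go?
print(snapshot_dir("webarchive", "auth.gr", date(2013, 12, 1)))
```

A typical use is to measure last month's snapshot with `directory_size_bytes`, pass that figure to `has_room_for`, and only then start the next crawl into the folder returned by `snapshot_dir`. The script itself can be started once per month by the operating system's task scheduler.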
  39. Can we save the web? YES WE CAN! • Questions? • Thank you for your attention. • Contact: – Web: http://vbanos.gr – Email: vbanos@gmail.com – Twitter: @vbanos