This presentation describes the EU-funded project SCAPE (Scalable Preservation Environments), its developments and its sustainability plans.
The SCAPE project has developed scalable services for planning and execution of institutional preservation strategies on an open source platform that orchestrates semi-automated workflows for large-scale, heterogeneous collections of complex digital objects.
The project ran for about 3½ years, from February 2011 to September 2014.
Read more about SCAPE at www.scape-project.eu
2. Digital Preservation – What do I need?
• Your collection of digital data is growing rapidly.
• Your preservation activities must become more
efficient and more scalable.
• You need SCAPE!
• The SCAPE project has developed scalable solutions
for long-term preservation of large-scale and
heterogeneous data sets.
This work was partially supported by the SCAPE Project.
The SCAPE project is co‐funded by the European Union under FP7 ICT‐2009.4.1 (Grant Agreement number 270137).
3.
What is SCAPE?
It's all about scalability!
• Scalable services for planning and execution of
institutional preservation strategies
• Infrastructure for the execution of digital
preservation processes on large volumes of data
• Existing tools have been improved and extended.
• New tools have been developed where necessary.
4.
What is SCAPE?
SCAPE covers a whole digital preservation life cycle
• Interconnecting services support
the preservation of large
repositories of digital objects
• Applications support the
formulation of preservation
policies, decision making and
selection of preservation actions
5.
What is SCAPE?
Take your pick – choose what you need!
• Use the full set of interconnected
SCAPE components or a selected
series of SCAPE tools or workflows.
• Many SCAPE components can be
individually incorporated.
6. Solutions Tested in Real Life
• All SCAPE solutions arise from real-world challenges at
partner institutions.
• Each challenge is tested in testbeds at the partner
institutions.
[Diagram labels: Testbeds; Web Content; Digital Repositories; Data Centres; Research Data Sets]
7. Solutions for Content Holders
Scalability
In four dimensions:
Heterogeneity of collections
as well as number, size and
complexity of objects
Automation
Through scalable,
automated and simple to
design preservation
workflows
Planning
Answering core
preservation planning
questions
Integration
Through a robust,
integrated, open source
preservation system
8.
Overview: SCAPE Architecture
9.
Overview: SCAPE Components
The SCAPE Platform is a reference architecture
for scalable preservation environments
10.
Overview: SCAPE Components
The SCAPE Preservation
Components are tools which
enhance the functionality of a
digital preservation system in:
• Scalability
• Functional coverage
• Quality
11. Overview: SCAPE Components
The SCAPE Planning and Watch
components address the
bottleneck in decision
processes and in processing the
information required for
decision making
12. Examples of tools and services
13.
Scalable Planning and Watch
Scout – an Automated Preservation Watch System
• Enables you to monitor your
collections
• Lets you access
community knowledge
• Collects relevant knowledge
and enables automated
notification
14.
Scalable Planning and Watch
C3PO – Content Profiling Tool for Preservation Analysis
• Analyses characterisation
metadata for digital collections
• Aggregates and combines the
metadata information across
collections
• Generates a profile of the
content set
• Allows use of different
metadata formats
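The aggregation idea in the bullets above can be illustrated with a minimal sketch. It is not C3PO's implementation: the record fields (`format`, `size`, `valid`) and the function `build_profile` are hypothetical, chosen only to show how per-object characterisation metadata could be combined into a collection-level profile.

```python
from collections import Counter

# Hypothetical, simplified characterisation records. Real characterisation
# output is far richer and comes in tool-specific metadata formats.
records = [
    {"format": "PDF", "size": 120_000, "valid": True},
    {"format": "PDF", "size": 80_000, "valid": False},
    {"format": "TIFF", "size": 2_400_000, "valid": True},
]


def build_profile(records):
    """Aggregate per-object metadata into a collection-level profile."""
    formats = Counter(r["format"] for r in records)
    return {
        "object_count": len(records),
        "format_distribution": dict(formats),
        "total_size_bytes": sum(r["size"] for r in records),
        "invalid_objects": sum(1 for r in records if not r["valid"]),
    }


profile = build_profile(records)
print(profile)
```

A profile of this kind gives planners a compact view of what a collection contains before they decide on preservation actions.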
15.
Scalable Planning and Watch
Plato – Scalable Preservation Planning
• Decision-making support tool
• Guides you through the
preservation planning
workflow
• Provides trust through
controlled experiments and
documentation
• Provides an executable plan
16.
Scalable Tools
ToMaR – let your Preservation Tools Scale
• Run existing tools against
large amounts of files
• Execute tools in a scalable
fashion on a MapReduce
cluster
• Enable scalable workflows
which chain together a set of
tools
• Process payloads too big to be
computed on a single machine
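The MapReduce pattern mentioned above can be sketched in a few lines. This is a single-process illustration of the pattern only, not ToMaR's API: `run_tool` is a hypothetical stand-in for an existing command-line tool, and on a real cluster the map and reduce phases would be distributed across many machines.

```python
from functools import reduce


def run_tool(path):
    """Hypothetical stand-in for an existing tool: emit (file extension, 1)."""
    return (path.rsplit(".", 1)[-1].lower(), 1)


def map_phase(paths):
    """Apply the tool to every file independently (parallelisable step)."""
    return [run_tool(p) for p in paths]


def reduce_phase(pairs):
    """Merge the per-file results into one summary keyed by extension."""
    def merge(acc, pair):
        key, count = pair
        acc[key] = acc.get(key, 0) + count
        return acc
    return reduce(merge, pairs, {})


counts = reduce_phase(map_phase(["a.tif", "b.TIF", "c.pdf"]))
print(counts)
```

Because each file is processed independently in the map phase, the work can be split across as many nodes as are available, which is what lets an unmodified tool scale.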
17.
Preservation Components
Pagelyzer – Monitor your Web Content
• Detect changes in web pages
• Compare web page versions
on a large scale
• Compare web page rendering
in different browsers
• Determine an appropriate
frequency for web harvests
18.
Preservation Components
Jpylyzer – Easy Validation of JPEG 2000
• Automated JP2 validation and
feature extraction
• Enables you to confirm
whether an image is a valid,
intact JP2 file
• Reports the key technical
properties of the image
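As a rough illustration of what format validation starts from, the sketch below checks only that a file begins with the 12-byte JP2 signature box. It is not Jpylyzer's implementation, which validates far more than this single check; the function name `has_jp2_signature` is hypothetical.

```python
# The JP2 signature box: box length 12, box type "jP  ",
# followed by the fixed content bytes 0D 0A 87 0A.
JP2_SIGNATURE = bytes.fromhex("0000000C6A5020200D0A870A")


def has_jp2_signature(data: bytes) -> bool:
    """Return True if the data starts with the JP2 signature box."""
    return data[:12] == JP2_SIGNATURE


# Usage with in-memory bytes; for a real file, read its first 12 bytes.
print(has_jp2_signature(JP2_SIGNATURE + b"remaining boxes"))
print(has_jp2_signature(b"\xff\xd8\xff\xe0"))  # JPEG (not JP2) header bytes
```

Passing this check says nothing about whether the rest of the file is intact, which is why a dedicated validator is needed.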
19.
Preservation Components
Matchbox – Easy Detection of Near-Duplicate Images
• Identify duplicate content,
even where files differ in
size, format, cropping etc.
or were scanned from a
different original copy
• Automate quality assurance
and reduce manual effort
20.
Preservation Components
xcorrSound – Automate Sound Wave Analysis
• Compare two audio files and
output the similarity
• Detect overlaps in audio files
• Detect occurrences of a
smaller audio file (e.g. a jingle)
within a larger audio file or an
index of audio files
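The slides do not describe the algorithm, but the tool's name suggests cross-correlation. The sketch below shows that idea under this assumption: it slides a short signal along a longer one and reports the offset with the highest correlation score. The function `best_offset` is hypothetical, the samples are toy integers rather than real audio, and a practical implementation would normalise the scores and use an FFT for speed.

```python
def best_offset(haystack, needle):
    """Return (offset, score) where needle correlates best with haystack."""
    best = (0, float("-inf"))
    for offset in range(len(haystack) - len(needle) + 1):
        # Dot product of the needle with the window starting at this offset.
        score = sum(h * n for h, n in zip(haystack[offset:], needle))
        if score > best[1]:
            best = (offset, score)
    return best


recording = [0, 0, 1, -1, 2, 0, 0]  # toy "larger audio file"
jingle = [1, -1, 2]                 # toy "smaller audio file"
offset, score = best_offset(recording, jingle)
print(offset, score)
```

The same score can serve as a similarity measure when two files of equal length are compared.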
21. Sustainability of Tools and Services
SCAPE tools are published as open source software.
Tools and services from SCAPE are sustained by
• Open Planets Foundation:
addresses core digital preservation
challenges and engages with the community
• COPTR:
Community Owned digital Preservation
Tool Registry
22. Sustainability of SCAPE results
Ultimate Sustainability goal:
• Supporting communities of practice by enabling
efficient collaboration during the project and
beyond.
Open Planets Foundation will take post-project
ownership of the outputs, supported by other
partners providing specific capabilities.
23. Sustainability of SCAPE results
Five complementary approaches:
• Visibility
Providing integrated outreach to multiple audiences to maximise
discoverability.
• Quality
Ensuring that project outputs conform to standards-driven quality
assurance.
• Training
Supporting skills development to further institutional capacity building.
• Open licensing
Using open licences to encourage the adoption and reuse of project
outputs.
• Community integration
Integrating project outputs into commercial and non-commercial
systems and services.
24. About SCAPE
• EU-funded project under FP7 (Research and
Technological Development)
• Project runtime: February 2011 to September 2014
• 20 partners from 10 countries, drawn from memory
institutions, data centres, research labs, universities,
and industrial firms
• Public project materials are licensed under a
CC-BY-SA International License
25.
SCAPE Consortium
26.
Additional Sources of Interest
• Development Infrastructure
• Code repository hosted by the Open Planets Foundation and GitHub
• https://github.com/openplanets/scape/
• Development Wiki
• http://wiki.opf-labs.org/display/SP/Home
• Tools
• http://www.scape-project.eu/tools
• Experimental Workflows
• http://www.myexperiment.org/search?query=SCAPE&type=all&commit=Search
• Publications
• http://www.scape-project.eu/category/publication
• Public Deliverables
• http://www.scape-project.eu/category/deliverable