Repository Redux:
The Past, Present & Future of Fedora
Open Repositories 2013
Charlottetown, Prince Edward Island
Thursday, 11 July 2013
Tom Cramer
@tcramer
A (Funding) History of Fedora
• Original software created 2000–08 with
$2.4M from the Mellon Foundation
➪ +$500,000 from UVA Library
• Moore Grant: 2007–11, $4.9M
• Committers from 10 institutions
• 2009–present: DuraSpace’s
sponsorship program has provided
funding for a tech lead
The Success of Fedora
• Architecture is demonstrably
Flexible & Extensible
• Support for Durability
• One foot in the linked data world
• A decade of maturity & proven use
• Substantial community of adopters,
contributors, vendors
Looming Storm Clouds
New Opportunities
• Front-ends: eSciDoc, Hydra,
Islandora
– Attracting new energy and adopters
– Creating new technical demands
• Evolved technical environment
– Web architectures & horizontal scaling
– Linked data
• Data management mandates
Fedora Futures Takes Shape
• OR12: Ad hoc meeting
• September ’12:
– Meet to compare needs and notes
– Charter a 3-month investigation
• December ’12:
– Commit to a three-year project
– Announce, Invite and Launch
Fedora Futures Objectives
• Preserve the strengths of the
architecture and community
• Address the needs for robust and
full-featured repository services (that we
now understand very well)
• Provide a platform in the repository
ecosystem for the next 5-10 years
Technical Requirements
• Highly scalable
• High availability
• Higher performance
• Flexible storage
• Robust auditing, reporting and metrics
• Enhanced fixity and versioning
Requirements, continued
• Work for small, medium and large
institutions
– Easy to deploy, administer
• Support breadth of needs
– Traditional IR
– Heterogeneous content (e.g., media)
– Emerging data management needs
• Interoperate with other systems
– Lean core, modular, APIs
Organizational Requirements
• Revitalized corps of developers
• Robust community investment and
governance
• Bigger community base
– Geographic
– Commercial & Non-Profit
– Additional domains
Fedora Futures Delivers
• January – Feb ’13
– Evaluate platforms for Alpha
• March – June ’13
– Alpha v1 development
• July ’13
– Alpha v1 released!
• 2nd half of ’13: Beta development
Fedora 4 Alpha 1 Highlights
• Roughly 80% of Fedora 3.x functionality
– in 7% of the lines of code
– with 72% test coverage (vs. 10% for Fedora 3.x)
• Clustering
• Batch operations
• Transaction support
• Policy-driven & projected storage
• Self-healing
• One step install…
Who Has Contributed So Far…
We Need You…
• Financial contributions
• Developer contributions
• Use cases & priorities
• Integration & testing
• Advocacy & evangelism
Next Steps
1. Donate money
http://duraspace.org/sponsors
2. Add a developer
contact awoods@duraspace.org
3. Join the email list
ff-tech@googlegroups.com
4. Install the alpha
github.com/futures/fcrepo4/
5. Chime in!
Give use cases, feedback

Fedora Futures for OR13


Editor's notes

  1. • Original software created 2000–2008
       – $2.4 million from the Mellon Foundation
       – $500,000+ from UVA Library
     • Codebase management and improvement has been continued by the committers group
       – Acuity Unlimited, Columbia, Cornell, DTU, FIZ Karlsruhe, the Denmark State Library, MediaShelf, UVa, U. of Wisconsin
       – Plus 3 independent software developers
     • DuraSpace's sponsorship program has provided funding for the technical lead
     V1 = 2003, V2 = 2005, V3 = 2008
  2. • 3 full versions over 12 years
     • Hundreds of adopters worldwide
     • Its own 501(c)(3)! With dozens of institutional sponsors
     • Its own ecosystem (vendors, other systems)
     • A shared international annual conference (OR)
     • Demonstrated success as a Flexible, Extensible digital repository architecture
  3. Chinese sign for crisis = danger + opportunity
  4. Grass roots across bottom Duraspace relative to F and FF
  5. Grass roots across bottom Duraspace relative to F and FF
  6. One of the very early decision points for Fedora Futures was whether to pursue:
     a) iterating on the existing Fedora 3.x codebase,
     b) a burn-it-all-down and build-it-anew greenfield project, or
     c) building on top of or extending an existing platform.

     For reasons of efficiency and risk management, we elected to build atop an existing platform. We devoted two sprints to candidate evaluation, which also included developing a test harness to help us measure performance. By the end of Sprint 2, we had selected ModeShape, which is a JCR implementation from JBoss.

     By building on top of ModeShape:
     • Rather than rebuilding from scratch (and therefore also taking on the responsibility of maintaining a full stack), we avoid re-inventing the wheel and re-use some best-of-breed products and technologies, so we can focus our always-limited resources on delivering a best-of-breed preservation repository service.
     • It provides us a solid foundation for building a highly available, scalable repository service.
     • Specifically, Fedora will finally be able to support transactions and clustering, and offer higher performance with more flexible storage options.

     As a result, we are on target to deliver (with varying degrees of completeness) on most if not all of the following objectives by mid-year (I’ve starred the objectives where development is already underway):
     • High scalability, high availability architecture
       – Scale out (horizontally) to meet high volume access or ingest requirements
       – Cluster configurations to avoid any single point of failure
     • Flexible storage
       – Policy-driven storage
       – Support storage such as AWS Glacier to meet retention requirements, but reduce costs
     • Durability
       – Where Fedora 3 was preservation-enabling, Fedora 4 will be preservation-enabled
       – One of the exciting features of Fedora 4 is really delivering on durability. In the current sprint, we're developing features to enable the repository to be self-healing: that is, detecting fixity failures and automatically restoring from a known-good redundant store
     • Reporting and metrics
       – More reporting and metrics to assist repository managers to make informed decisions
     • Developer-friendliness
       – Modular architecture
       – Extensibility for non-Java developers (e.g. Ruby, Python & Scala)
     • Ease of deployment
       – Allow dev-ops and sysadmins to deploy Fedora easily and consistently across VMs or cloud infrastructure
       – Better support for configuration management tools such as Puppet or Chef
  7. Same as note 6.
  8. Dev team with 11 developers from 8 institutions = 4 FTEs, managed by Eddie Shin, working for 12 two-week sprints
  9. Grass roots across bottom Duraspace relative to F and FF
  10. Grass roots across bottom
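
The self-healing idea mentioned in the notes (detect a fixity failure, then restore from a known-good redundant store) can be sketched roughly as below. This is only an illustration in Python, not Fedora 4 or ModeShape code: the function names `sha256_of` and `check_and_heal`, the use of plain files as the "primary" and "replica" stores, and the choice of SHA-256 as the fixity checksum are all assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_and_heal(primary: Path, replica: Path, expected: str) -> str:
    """Hypothetical fixity check with repair from a redundant copy.

    Returns "ok" if the primary copy matches the recorded checksum,
    "healed" if it did not but was restored from a matching replica,
    and "unrecoverable" if neither copy matches.
    """
    if primary.exists() and sha256_of(primary) == expected:
        return "ok"
    # Only trust the replica if it still matches the recorded checksum.
    if replica.exists() and sha256_of(replica) == expected:
        shutil.copyfile(replica, primary)
        return "healed"
    return "unrecoverable"
```

A repository would presumably run a check like this on a schedule and record each outcome in its audit log; the scheduling and logging are left out here to keep the sketch short.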