1. The Data Management Ecosystem
4 April 2013
University of California Curation Center
California Digital Library
2. The research data problem
Journal article | Research data
– Uniquely and persistently identified | Nope
– Concept of “publish” | Not really
– Multiple copies | Typically one
– Easily findable | Difficult
– Services: impact metrics, citation tracking, etc. | Nope
Research data is seen as a second-class citizen in the scholarly record.
3. An ecosystem of inter-dependent partners
Besides data repository and publisher partners...
• researchers
• educators
• citizen science groups
• funders
• tenure and promotion committees
Libraries as neutral connection partners
4. Where can libraries make a difference?
Research & Scholarship Lifecycle
[Cycle diagram: Collect > Publish > Share > Save > Research, arranged around "Create Knowledge"]
5. Collect > Publish > Share > Save > Research
• DMPTool: create, edit, share, and save data management plans
• DataUp: open source curation add-in for Microsoft Excel
• Web Archiving Service (WAS): capture today’s web; build tomorrow’s archives
6. Collect > Publish > Share > Save > Research
Create and manage persistent
identifiers: ARKs, DOIs, etc.
An infrastructure to publish and get
credit for sharing research data
7. Collect > Publish > Share > Save > Research
Curation repository:
store, manage, preserve, and share
research data
Open deposit, open access
repository for spreadsheet data
Data Observation Network for Earth
8. Collect > Publish > Share > Save > Research
What’s missing to complete the “incentive” circuit?
• Impact measures, citation tracking
“Connecting the data to the
research it informs”
Altmetrics tools to measure non-traditional products and uses
9. Stable storage: Merritt repository
• Curation repository open to the UC
community and beyond
• Discipline / content agnostic
• Micro-services architecture
• Easy-to-use UI or API
• Hosted or locally deployed
10. EZID: Long term identifiers made easy
• Precise identification of a
dataset (DOI or ARK)
• Credit to data producers and
data publishers
• A link from the traditional
literature to the data (DataCite)
• Exposure and research metrics
for datasets
(Web of Knowledge, Google)
Take control of the
management and distribution
of your research, share and get
credit for it, and build your
reputation through its collection
and documentation
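The two identifier types named on this slide can be told apart and turned into resolvable links with very little code. The following is a minimal sketch, not EZID's own API: the function names are hypothetical, the patterns are simplified (for example, they assume an all-digit ARK name assigning authority number), and the identifiers in the usage lines are placeholders rather than real datasets. It assumes the public resolvers doi.org (for DOIs) and n2t.net (for ARKs).

```python
import re

# Public resolvers assumed by this sketch.
RESOLVERS = {"doi": "https://doi.org/", "ark": "https://n2t.net/"}

# Simplified patterns; real-world identifiers can be more varied.
DOI_PATTERN = re.compile(r"^(doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)
ARK_PATTERN = re.compile(r"^ark:/?\d+/\S+$", re.IGNORECASE)


def classify(identifier: str) -> str:
    """Return 'doi' or 'ark' for a recognised identifier string."""
    text = identifier.strip()
    if DOI_PATTERN.match(text):
        return "doi"
    if ARK_PATTERN.match(text):
        return "ark"
    raise ValueError(f"not a recognised DOI or ARK: {identifier!r}")


def resolver_url(identifier: str) -> str:
    """Build a link that a resolver can redirect to the dataset's location."""
    text = identifier.strip()
    if classify(text) == "doi":
        # Drop an optional 'doi:' label; doi.org expects the bare name.
        return RESOLVERS["doi"] + re.sub(r"^doi:", "", text, flags=re.IGNORECASE)
    # Normalise the ARK label to the 'ark:/' form before appending.
    return RESOLVERS["ark"] + re.sub(r"^ark:/?", "ark:/", text, flags=re.IGNORECASE)


# Placeholder identifiers, for illustration only.
print(resolver_url("doi:10.5072/FK2EXAMPLE"))
print(resolver_url("ark:/99999/fk4example"))
```

Because a citation carries the identifier rather than a storage location, the data can move between repositories without breaking the link from the literature.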
11. Discovery: DataCite consortium
• Technische Informationsbibliothek (TIB), Germany
• Canada Institute for Scientific and Technical Information (CISTI)
• L’Institut de l’Information Scientifique et Technique (INIST), France
• Australian National Data Service (ANDS)
• The British Library
• Library of the ETH Zürich
• California Digital Library, USA
• Library of TU Delft, The Netherlands
• Office of Scientific and Technical Information, US Department of Energy
• Purdue University, USA
• Technical Information Center of Denmark
12. New distributed framework
Flexible, scalable, sustainable network
Member Nodes
• diverse institutions
• serve local community
• provide resources for managing their data
Coordinating Nodes
• retain complete metadata catalog
• subset of all data
• perform basic indexing
• provide network-wide services
• ensure data availability (preservation)
• provide replication services
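The division of labour on this slide can be sketched as a toy model. This is an illustration only, not the network's actual software or API: the class names, method names, node names, and identifier are all hypothetical. It shows member nodes holding data for their own communities while a coordinating node keeps a network-wide catalog of who holds what and records replicas.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MemberNode:
    """Holds data for a local community (hypothetical model)."""
    name: str
    holdings: Dict[str, bytes] = field(default_factory=dict)

    def deposit(self, identifier: str, data: bytes) -> None:
        self.holdings[identifier] = data


@dataclass
class CoordinatingNode:
    """Keeps a catalog mapping each identifier to the nodes that hold a copy."""
    catalog: Dict[str, Set[str]] = field(default_factory=dict)

    def index(self, node: MemberNode) -> None:
        # Basic indexing: record every identifier the member node holds.
        for identifier in node.holdings:
            self.catalog.setdefault(identifier, set()).add(node.name)

    def replicate(self, identifier: str, source: MemberNode, target: MemberNode) -> None:
        # Replication service: copy the object and record the new location.
        target.deposit(identifier, source.holdings[identifier])
        self.catalog.setdefault(identifier, set()).add(target.name)

    def locate(self, identifier: str) -> List[str]:
        # Network-wide lookup: which nodes can serve this object?
        return sorted(self.catalog.get(identifier, set()))


site_a = MemberNode("site-a")
site_b = MemberNode("site-b")
site_a.deposit("ark:/99999/fk4example", b"temperature,depth\n4.1,10\n")

coordinator = CoordinatingNode()
coordinator.index(site_a)
coordinator.replicate("ark:/99999/fk4example", site_a, site_b)
print(coordinator.locate("ark:/99999/fk4example"))  # ['site-a', 'site-b']
```

Keeping the catalog separate from the holdings is what lets a lookup succeed even when the original member node is unavailable, which is the preservation point made above.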
13. The rest of the story
www.cdlib.org/uc3
John.Kunze@ucop.edu
uc3@ucop.edu for service questions
Editor's notes
Panel: Partnerships between institutional repositories, domain repositories, and publishers
20-25 mins, 9:30-11am
The 'data management ecosystem' angle seems appropriate for the panel, but feel free to share some of the technical aspects with the audience, too.
Partnerships via conventions and APIs. Data citation conventions.
Libraries are chipping away on several fronts to try to shrink this "data curation" problem to a more manageable size, and they are offering a great deal of support for data management planning, data citation, identifier and repository services, repository federation, and “data publication”.
Research data can be seen to fit in a kind of ecosystem of inter-dependent stakeholder niches. Each niche depends on other niches. In a broad sense, partnerships are about dependencies. Besides explicit partnerships between publishers and institutional and domain repositories, there are other critical inter-dependencies – essentially implicit partnerships.
Libraries as neutral connectors to sub-partners in
• system development and collection building
• linking with museums and archives
Development partners:
• DMPTool: U Va, Smithsonian, DCC, et al.
• DataUp: MSRC, GBMF, D1
• WAS: LC, UNT, NYU, et al.
User partners (clients, patrons, customers): any
Partners: JISC/EDINA, paying customers on two continents
D1 network partners all over the world
Partnering with eScholarship and UC campuses for collection building
Partnering with JISC/EDINA, DataCite, the Research Data Alliance
Each member partners with regional data repositories
DataCite partners with publishers (e.g., T-R) for data citation index
• Credit
• Discovery
• Impact tracking
Helping data authors verify use of their data and helping identify how others have used the data
With archiving: re-use and reproducibility