1. Supporting the Research Data Life Cycle
Joan Starr
@joan_starr
University of California Curation Center
California Digital Library
Columbia Research Data Symposium
2. Partnership between CDL | 10 UC campuses | Peer institutions
Provide solutions, services, resources for digital assets
Pool & distribute diverse experience, expertise, & resources
3. A life cycle approach
plan | collect | manage | share
• Create, edit, share, and save data management plans
• Create and manage long-term identifiers
• Open source add-in & Web app for Microsoft Excel as a data collection tool
• Collect, manage, preserve and publish websites and documents
• Curation repository: store, manage, and share research data
• Open Access publishing services / dynamic research platform
4. A life cycle approach
plan
• Create, edit, share, and save data management plans
• Create and manage long-term identifiers
• Open source add-in & Web app for Microsoft Excel as a data collection tool
• Collect, manage, preserve and publish websites and documents
• Curation repository: store, manage, and share research data
• Open Access publishing services / dynamic research platform
5. DMPTool
Meeting funding agencies' data management plan requirements
• Connect researchers to resources
to create a data management plan
• NSF and directorates, NIH, NEH,
IMLS, foundations plus
• Customizable
Primary Functions
1. Step-by-step “wizard”
2. Templates and examples
3. Links to institutional resources and agency information
4. Plan publication and sharing
Data Curation for Practitioners Workshop
6. DMP Tool: https://dmp.cdlib.org/
Usage
[Figure: DMPTool usage, Oct-11 to Aug-12. Left axis (0 to 3500): number of plans (solid) and unique users (dashed). Right axis (0 to 600): number of institutions. Series: Unique Users, Plans, Institutions.]
7. @ezidCDL
EZID
Long-term identifiers made easy
• Precise identification of a dataset
(DOI or ARK)
• Credit to data producers and data
publishers
• A link from the traditional literature
to the data
• Exposure and research metrics for
datasets
(Web of Knowledge, Google)
Primary Functions
1. Create long-term identifiers
2. Manage identifiers (and associated
metadata) over time
3. Resolve identifiers
10. A life cycle approach
collect
• Create, edit, share, and save data management plans
• Create and manage long-term identifiers
• Open source add-in & Web app for Microsoft Excel as a data collection tool
• Collect, manage, preserve and publish websites and documents
• Curation repository: store, manage, and share research data
• Open Access publishing services / dynamic research platform
11. @DataUpCDL
DataUp
Collect, share, archive, publish data
Primary Functions
1. An Excel add-in and a cloud application
2. Document data
3. Check for good data practices
4. Obtain identifier and citation
5. Archive and share
12. DataUp: http://dataup.cdlib.org/
Researchers: How Frequently Do You Use Excel?
[Figure: stacked bar chart, 0% to 100%, of how often researchers use Excel, on a scale from 1 (rarely) to 5 (every day), grouped by career stage from undergraduate through postdoc and scientist. 55 respondents.]
Carly Strasser, CDL
13. Web Archiving Service (WAS)
• ARCHIVE institution websites
• BUILD collections for research
• CAPTURE political and social events
• SAVE at-risk government websites
Primary Functions
1. Capture
2. Manage
3. Preserve
4. Publish
14. WAS: http://webarchives.cdlib.org/
WAS Snapshot
54 public archives
120+ archives total
7,500+ sites
50+ TB
23 institutions
15. A life cycle approach
plan
collect
manage
share
16. For more information
UC3 Data Management Planning Resources
http://www.cdlib.org/services/uc3/dmp/index.html
Twitter: @ezidCDL and @DataUpCDL
Email: uc3@ucop.edu; washelp@ucop.edu
How to find me:
Twitter: @joan_starr
Email: joan.starr@ucop.edu
Editor's notes
But first, a very brief context setting. Serving the 10 UC campuses: 226,000 students; 134,000 faculty and staff.
What is a data management plan? A document that describes what you will do with your data during and after you complete your research.
The DMPTool “walks” scientists through the process of developing a concise but comprehensive data management plan that could enable good stewardship of data and meet the requirements of sponsors and home institutions.
Partners: University of Virginia Library, University of Illinois at Urbana-Champaign Library, DataONE, UCLA, and UCSD.
The California Digital Library and its partners were awarded a $590,000 grant from the Alfred P. Sloan Foundation to fund further development of the popular Data Management Planning Tool in 2013. The bulk of the grant will go to the UC Curation Center (UC3) at the CDL to fund improvements to the DMPTool, including expanded functionality, training modules, documentation, and the creation of an open-source community to sustain the DMPTool in the future. Project partners are the University of Virginia Library, University of Illinois at Urbana-Champaign Library, and DataONE.
Supplemental grant in the works. Received 2012 Digital Preservation Award recognition from the Library of Congress.
My colleague, Carly Strasser, the Service Manager for DataUp, spoke to about 200 researchers, and these trends held. Key findings: no data preservation; unaware of archives; resistant to sharing; poor data documentation.