From Data Sharing to Data
Stewardship: Meeting Federal
Data Sharing Requirements
ACRL 2015
Thursday, March 26, 2015
ICPSR – University of Michigan
Hashtag: #icpsr
https://www.flickr.com/photos/29261037@N02/8896766525
https://www.flickr.com/photos/shawnhoke/6040690284
Direct identifiers
• Addresses, including ZIP and other postal codes
• Telephone numbers, including area codes
Indirect identifiers
• Exact dates of events (birth, death, marriage)
• Detailed income
• Detailed geographic information (e.g., county)
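The two identifier categories above can be sketched as a simple column screen. This is only an illustration: the column names and the `classify_columns` helper are hypothetical, and the keyword sets cover just the examples listed on the slide, not a complete disclosure review.

```python
# Hypothetical column names; the two sets mirror only the slide's examples.
DIRECT = {"address", "zip_code", "telephone"}
INDIRECT = {"birth_date", "death_date", "marriage_date", "income_detailed", "county"}


def classify_columns(columns):
    """Map each column name to 'direct', 'indirect', or 'unflagged'."""
    result = {}
    for name in columns:
        key = name.strip().lower()
        if key in DIRECT:
            result[name] = "direct"
        elif key in INDIRECT:
            result[name] = "indirect"
        else:
            result[name] = "unflagged"
    return result


report = classify_columns(
    ["respondent_id", "zip_code", "birth_date", "county", "q1_score"]
)
for column, label in report.items():
    print(f"{column}: {label}")
```

A name-based screen like this only flags candidates for review; an unflagged column can still carry disclosure risk in combination with others.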
“The study is composed of about 180,000 autopsy x-ray
image files taken of 58 corpses. The images
originally arrived on DVD and are formatted to
comply with the Digital Imaging and
Communications in Medicine (DICOM) standard….
The images are the data of the study, the images
files themselves contain metadata (metadata on the
images) scrubbed of identifiers but there isn't much
in terms of documentation.”
http://www.wired.com/wp-content/uploads/2014/04/480815249-660x672.jpg
Today
• History (brief!) of federal data sharing requirements
• What is good data sharing? How do you achieve data
stewardship?
• Public data sharing services – tours & take-away tips
• Resources for creating data management plans and
funding quotes
You should leave this session with -
• Keen understanding of several sustainable data
sharing models
• Ability to assess data sharing services
– Through review of several services
– Walk-away tips for evaluating
• Knowledge (a portal) of resources for creating
data management plans for grant applications
ICPSR
• 50+ years of experience
• Data stewardship
• Data management
• Data curation
• Data preservation
ICPSR Archives/Repositories
Recent Federal Data Sharing Initiatives
• NIH: 2003 – data sharing plans
• NSF: 2011 – data management plans
• OSTP: 2013 – Memo with subject “Increasing
Access to the Results of Federally Funded
Scientific Research”
https://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
http://sites.nationalacademies.org/DBASSE/CurrentProjects/DBASSE_082378
http://www.icpsr.umich.edu/files/ICPSR/ICPSRComments.pdf
http://guides.library.oregonstate.edu/federaloa
http://bit.ly/FedOASummary
Data Portion of Memo - 13 Elements
• The elements are also summarized online
within ICPSR’s Web site:
http://icpsr.umich.edu/content/datamanagement/ostp.html
1.Maximize access
2.Protect confidentiality and privacy
3.Appropriate attribution
4.Long term preservation and
sustainability
5.Data management planning
UK results on data sharing attitudes
• In 2011 survey, 85% of researchers said they
thought their data would be of interest to
others.
• Only 41% said they would be happy to make
their data available.
• Only a third had previously published data.
Source: DaMaRO Project, University of Oxford
http://www.slideshare.net/DigCurv/15-meriel-patrick
Data Sharing Status
Federal Agency | Shared Formally, Archived (n=111) | Shared Informally, Not Archived (n=415) | Not Shared (n=409)
NSF (27.3%)    | 22.4%                             | 43.7%                                   | 33.9%
NIH (72.7%)    | 7.4%                              | 45.0%                                   | 47.6%
Total          | 11.5%                             | 44.6%                                   | 43.9%
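As a quick arithmetic check, the Total row appears to be the average of the NSF and NIH rows weighted by each agency's share of the sample (the percentages in parentheses). That reading is an assumption, not something the slide states; the sketch below only recomputes the totals under it, and the dictionary keys are illustrative names.

```python
# Assumption: the parenthesised percentages are each agency's share of the sample.
agency_share = {"NSF": 0.273, "NIH": 0.727}

# Percentages copied from the table, by sharing status.
rates = {
    "NSF": {"formal": 22.4, "informal": 43.7, "not_shared": 33.9},
    "NIH": {"formal": 7.4, "informal": 45.0, "not_shared": 47.6},
}


def weighted_total(category):
    """Share-weighted average of the agency rates for one sharing status."""
    return sum(agency_share[a] * rates[a][category] for a in agency_share)


totals = {c: round(weighted_total(c), 1) for c in ("formal", "informal", "not_shared")}
print(totals)
```

Under this assumption the recomputed values match the Total row of the table to one decimal place.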
Pienta, Gutmann, & Lyle (2009). “Research Data in The Social Sciences: How Much is Being Shared?”
http://ori.hhs.gov/content/research-research-integrity-rri-conference-2009
See also: Pienta, Gutmann, Hoelter, Lyle, & Donakowski (2008). “The LEADS Database at ICPSR:
Identifying Important ‘At Risk’ Social Science Data.”
http://www.data-pass.org/sites/default/files/Pienta_et_al_2008.pdf
Pienta, Alter, & Lyle (2010). “The Enduring Value of Social Science Research: The
Use and Reuse of Primary Research Data”. http://hdl.handle.net/2027.42/78307
What is good data sharing - the basis of
data stewardship?
1.Maximize access
2.Protect confidentiality and privacy
3.Appropriate attribution
4.Long term preservation and sustainability
5.Data management planning
Maximize Access (Data Curation)
Discoverable
http://www.flickr.com/photos/papertrix/38028138/
Accessible
http://www.guardian.co.uk/science/grrlscientist/2012/mar/29/1
A well-prepared data collection
“contains information intended to
be complete and self-explanatory”
for future users.
Do no harm.
Protect confidentiality and privacy
• It is critically important to protect the identities of research
subjects
• Disclosure risk is a term that is often used for the possibility
that a data record from a study could be linked to a specific
person
• Data with these risks can be shared via a secured virtual
environment
• Data concerning very sensitive topics can also be shared via
a secured environment
Appropriate Attribution
• Properly citing data encourages the replication of
scientific results, improves research standards, guarantees
persistent reference, and gives proper credit to data
producers.
• Citing data is straightforward. Each citation must include
the basic elements that allow a unique dataset to be
identified over time: title, author, date, version, and
persistent identifier.
• Resources: ICPSR's Data Citations page, IASSIST's Quick
Guide to Data Citation, DataCite.
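The five citation elements named above can be captured in a small record that refuses to format an incomplete citation. This is a minimal sketch: the `DataCitation` class, the element order in the output string, and the sample values (including the placeholder DOI) are all hypothetical and do not reproduce any particular citation style.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """The five basic elements: author, date, title, version, persistent identifier."""
    author: str
    date: str
    title: str
    version: str
    persistent_id: str

    def missing_elements(self):
        """Names of elements that are empty or whitespace only."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def format(self):
        """Return a one-line citation, or raise if any element is missing."""
        missing = self.missing_elements()
        if missing:
            raise ValueError(f"citation is missing: {', '.join(missing)}")
        return (
            f"{self.author} ({self.date}). {self.title}. "
            f"{self.version}. {self.persistent_id}"
        )


# Placeholder values for illustration only; not a real dataset.
example = DataCitation(
    author="Doe, Jane",
    date="2015",
    title="Example Survey",
    version="Version 1",
    persistent_id="https://doi.org/10.0000/example",
)
print(example.format())
```

Checking for missing elements before formatting reflects the point that a citation must identify one unique dataset over time; dropping the version or the persistent identifier defeats that purpose.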
Long term preservation and sustainability
“Digital information lasts forever or five years,
whichever comes first”.
-Jeff Rothenberg
https://flic.kr/p/arHsh4
http://www.flickr.com/photos/blude/2665906010/
Data Management Planning
• Data management plans describe how researchers
will provide for long-term preservation of, and
access to, scientific data in digital formats.
• Data management plans provide opportunities for
researchers to manage and curate their data more
actively from project inception to completion.
• See ICPSR's resource: Guidelines for Effective Data
Management Plans
The Status of Data Sharing
– Good data sharing exists!
– Good data sharing requires funding -
sustainable funding!
– Sustainable funding for free public access
remains a challenge
Sustainable Data Sharing Models –
Three to Explore
• Fee for access model (subscription model)
• Agency model (agency or foundation funds
public access)
• Fee for deposit model (researcher writes fee
into grant and pays at deposit to fund public
access)
I. Fee-for-Access Data Sharing
• Funding is maintained by annual subscription fees charged to
institutions; individuals at subscribing institutions have free
(open) access to data
• Pooled (ongoing) subscriber fees are used to acquire, curate,
and maintain the service
• The service, open to everyone, is thus sustained by subscribers,
but agencies indicate these models are not ‘open enough’
because of the access fees
II. Agency-funded Data Sharing
• Agency sponsors/funds (ongoing) data curation & sharing enabling the
public to access without charge
• The archive is hosted with a curation entity like ICPSR where the public
can easily discover and access data and restricted-use data can also be
securely shared
• Agency directs data selection and compliance policies
III. Fee-for-Deposit Data Sharing
• Depositor (individual or entity) pays for data to be
curated and stored – a fee at deposit
• Deposit fees should be written into the grant
application
• Incoming deposit fees sustain the service and the
professionals behind it
• Sustainability risk fairly high in this model as it
depends upon:
– Continuous influx of deposit fees
– Depositors to put allocated fees towards curation & sharing
• Data tends to be bit-level (not curated): WIDIWYG
Fee for Deposit Services Arriving Daily!
(tips for evaluating coming shortly)
First: A Side-Note on Sharing
Restricted-Use Data
• Data with disclosure risk –
potential to identify a research
subject
• Data with highly sensitive
personal information
What is Restricted-Use Data?
Common Objection/Misperception:
“My data are too sensitive to share. . .”
• ICPSR has been sharing restricted-use data for
over a decade. Three methods are used:
– Secure Download
– Virtual Data Enclave
– Physical Enclave
• ICPSR stores & shares over 6,400 restricted-use
datasets associated with over 2,000
‘active’ restricted-use data agreements
Reality: Restricted-use data can be
effectively shared with the public
• Through the use of a virtual data enclave where
the data never leave the server
• Where there is a process (and understanding!)
to garner IRB approval from the requesting
scientist’s university
• Where there is a system, technology, data
professionals, and collaboration space in place
to disseminate (expensive to build!)
• Because agencies do allow for an incremental
charge to the data requestor to offset marginal
costs
Review of Public Data Sharing Services
• Overview of public data sharing services we have
reviewed
– Some key strengths of each
• Disclaimer: ICPSR has recently launched a public access
service (hosted)
– You’ll likely notice some bias when we talk about the
strengths of openICPSR
– And because we built the service, we know much more
about it
– Still, ICPSR’s public access service isn’t for everyone –
more on that shortly
Public Data Sharing Services
openICPSR – www.openicpsr.org
How is openICPSR unique?
openICPSR is a public data-sharing service:
• Where the deposit is reviewed by professional data curators who
are experts in developing metadata (tags) for the social and
behavioral sciences = discoverable
• With an immediate distribution network of over 750 institutions
looking for research data, that has powerful search tools, and a
data catalog indexed by major search engines = usage
• Sustained by a respected organization with over 50 years of
experience in reliably protecting research data = sustainable
• Prepared to accept and disseminate sensitive and/or restricted-use
data in the public-access environment = protection of research
subjects
How will openICPSR disseminate sensitive
data to the public?
• The deposit of sensitive (restricted-use) data is similar
to the deposit of non-sensitive data except that the
depositor will indicate that the data should be for
restricted-use only
• Dissemination of sensitive data will be through
ICPSR’s virtual data enclave; in this environment, data
never leave the secure server and analysis takes place
in the virtual space
• Scientists desiring to access the data will need to
apply for the data and will pay an access fee
• openICPSR has already received sensitive (restricted-use)
data, and dissemination of these data has begun
openICPSR for Institutions and Journals
• Uses openICPSR platform
• Fully hosted in the ICPSR
cloud – no tech or patches
needed
• Branded with a logo and
colors
• Deposits incorporated into
ICPSR’s data catalog
• On-demand administrative
usage tools
A final note: openICPSR accepts research data from
a wide array of disciplines/fields, but not all
Tips for Evaluating a Data Sharing Service
Questions to consider when selecting a data sharing service:
• How will the service sustain itself? Does it have a long term funding
stream?
• How will the service care for my data in the long term should the service
fail? Is there a plan? A safety net?
• Can the service quickly maximize discoverability of my data? Does it
explain how it will do so?
• Does the service have a network of interested researchers & students
seeking data? Will my data get used?
• Does the service have knowledge of international archiving standards?
• Does the service provide a DOI, data citation, and version control should I
need to update my files?
• I have sensitive data or data with some disclosure risk to deposit. Does
the service understand how to secure it upon intake and when sharing?
Does it have experience in this area?
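One way to apply these questions when comparing services is to record a yes/no answer per question and list what remains unmet. The sketch below assumes that approach; the criterion keys and the `unmet_criteria` helper are hypothetical shorthand for the seven questions, and the sample answers describe no real service.

```python
# Hypothetical shorthand keys, one per question in the list above.
CRITERIA = [
    "long_term_funding",
    "succession_plan",
    "discoverability",
    "user_network",
    "archiving_standards",
    "doi_citation_versioning",
    "sensitive_data_experience",
]


def unmet_criteria(answers):
    """Return the criteria answered 'no' or left unanswered, in list order."""
    return [c for c in CRITERIA if not answers.get(c, False)]


# Illustrative answers for an imaginary service.
sample_answers = {
    "long_term_funding": True,
    "succession_plan": False,
    "discoverability": True,
    "user_network": True,
    "archiving_standards": True,
    "doi_citation_versioning": True,
}
print(unmet_criteria(sample_answers))
```

Treating an unanswered question as unmet is a deliberate choice here: a service that does not explain how it handles a criterion gives the depositor nothing to rely on.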
Resources for Creating Data
Management Plans for Grant
Applications
ICPSR’s Data Management & Curation Site
http://www.icpsr.umich.edu/datamanagement/
Purpose of Data Management Plans
• Data management plans describe how researchers
will provide for long-term preservation of, and
access to, scientific data in digital formats.
• Data management plans provide opportunities for
researchers to manage and curate their data more
actively from project inception to completion.
Data Management
Plan Resources
DMP Template Tool to Get You Started!
Guidelines for Download
And still more guidelines after the
project is awarded:
• Guide emphasizes
preparation for data
sharing throughout
the project
• Available online and
via download (pdf)
ICPSR Data Curation Training Workshops
• 1-5 day workshops on data curation/data
repository management decisions
– Participants learn about best practices and
tools for data curation, from selecting and
preparing data for archiving to optimizing and
promoting data for reuse
• Available via ICPSR Summer Program (Ann
Arbor – July 27-31, 2015) or onsite at your
institution
Copies of these Slides & Use
• Feel free to share them, present
them, cite them!
• Find copies of these slides
on Slideshare.net
– Several notes and
additional links are found in
the notes view
Get More information
• Visit ICPSR’s Data Management &
Curation site:
http://www.icpsr.umich.edu/datamanagement/index.jsp
• Contact us:
– netmail@icpsr.umich.edu
– (734) 647-2200
• More on Assuring Access to
Scientific Data: white paper –
“Sustaining Domain Repositories
for Digital Data”

SOC 101 Demonstration of Learning Presentationcamerronhm
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - Englishneillewis46
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...pradhanghanshyam7136
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfNirmal Dwivedi
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibitjbellavia9
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the ClassroomPooky Knightsmith
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Pooja Bhuva
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 

Último (20)

SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Graduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - EnglishGraduate Outcomes Presentation Slides - English
Graduate Outcomes Presentation Slides - English
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdfUGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
UGC NET Paper 1 Mathematical Reasoning & Aptitude.pdf
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Fostering Friendships - Enhancing Social Bonds in the Classroom
Fostering Friendships - Enhancing Social Bonds  in the ClassroomFostering Friendships - Enhancing Social Bonds  in the Classroom
Fostering Friendships - Enhancing Social Bonds in the Classroom
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
Beyond_Borders_Understanding_Anime_and_Manga_Fandom_A_Comprehensive_Audience_...
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 

Meeting Federal Research Requirements

  • 1. From Data Sharing to Data Stewardship: Meeting Federal Data Sharing Requirements ACRL 2015 Thursday, March 26, 2015 ICPSR – University of Michigan Hashtag: #icpsr
  • 5. Direct identifiers • Addresses, including ZIP and other postal codes • Telephone numbers, including area codes Indirect identifiers • Exact dates of events (birth, death, marriage) • Detailed income • Detailed geographic information (e.g., county)
  • 6. “The study is composed of about 180,000 autopsy x-ray image files taken of 58 corpses. The images originally arrived on DVD and are formatted to comply with the Digital Imaging and Communications in Medicine (DICOM) standard…. The images are the data of the study, the images files themselves contain metadata (metadata on the images) scrubbed of identifiers but there isn't much in terms of documentation.”
  • 8. Today • History (brief!) of federal data sharing requirements • What is good data sharing? How do you achieve data stewardship? • Public data sharing services – tours & take-away tips • Resources for creating data management plans and funding quotes
  • 9. You should leave this session with - • Keen understanding of several sustainable data sharing models • Ability to assess data sharing services – Through review of several services – Walk-away tips for evaluating • Knowledge (a portal) of resources for creating data management plans for grant applications
  • 10. • 50+ years of experience • Data stewardship • Data management • Data curation • Data preservation ICPSR
  • 12. Recent Federal Data Sharing Initiatives • NIH: 2003 – data sharing plans • NSF: 2011 – data management plans • OSTP: 2013 – Memo with subject “Increasing Access to the Results of Federally Funded Scientific Research”
  • 19. Data Portion of Memo - 13 Elements • The elements are also summarized online within ICPSR’s Web site: http://icpsr.umich.edu/content/datamanagement/ostp.html
  • 20. 1. Maximize access 2. Protect confidentiality and privacy 3. Appropriate attribution 4. Long term preservation and sustainability 5. Data management planning
  • 21. UK results on data sharing attitudes • In 2011 survey, 85% of researchers said they thought their data would be of interest to others. • Only 41% said they would be happy to make their data available. • Only a third had previously published data. Source: DaMaRO Project, University of Oxford http://www.slideshare.net/DigCurv/15-meriel-patrick
  • 22. Data Sharing Status

      Federal Agency | Shared Formally, Archived (n=111) | Shared Informally, Not Archived (n=415) | Not Shared (n=409)
      NSF (27.3%)    | 22.4%                             | 43.7%                                   | 33.9%
      NIH (72.7%)    | 7.4%                              | 45.0%                                   | 47.6%
      Total          | 11.5%                             | 44.6%                                   | 43.9%

      Source: Pienta, Gutmann, & Lyle (2009). “Research Data in The Social Sciences: How Much is Being Shared?” http://ori.hhs.gov/content/research-research-integrity-rri-conference-2009
      See also: Pienta, Gutmann, Hoelter, Lyle, & Donakowski (2008). “The LEADS Database at ICPSR: Identifying Important ‘At Risk’ Social Science Data.” http://www.data-pass.org/sites/default/files/Pienta_et_al_2008.pdf
      Pienta, Alter, & Lyle (2010). “The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data.” http://hdl.handle.net/2027.42/78307
  • 23. What is good data sharing - the basis of data stewardship? 1. Maximize access 2. Protect confidentiality and privacy 3. Appropriate attribution 4. Long term preservation and sustainability 5. Data management planning
  • 27. A well-prepared data collection “contains information intended to be complete and self-explanatory” for future users. Do no harm.
  • 28. Protect confidentiality and privacy • It is critically important to protect the identities of research subjects • Disclosure risk is a term that is often used for the possibility that a data record from a study could be linked to a specific person • Data with these risks can be shared via a secured virtual environment • Data concerning very sensitive topics can also be shared via a secured environment
  • 29. Appropriate Attribution • Properly citing data encourages the replication of scientific results, improves research standards, guarantees persistent reference, and gives proper credit to data producers. • Citing data is straightforward. Each citation must include the basic elements that allow a unique dataset to be identified over time: title, author, date, version, and persistent identifier. • Resources: ICPSR's Data Citations page, IASSIST's Quick Guide to Data Citation, DataCite.
  • 30. Long term preservation and sustainability
  • 31. “Digital information lasts forever or five years, whichever comes first”. -Jeff Rothenberg
  • 35. Data Management Planning • Data management plans describe how researchers will provide for long-term preservation of, and access to, scientific data in digital formats. • Data management plans provide opportunities for researchers to manage and curate their data more actively from project inception to completion. • See ICPSR's resource: Guidelines for Effective Data Management Plans
  • 36. The Status of Data Sharing – Good data sharing exists! – Good data sharing requires funding - sustainable funding! – Sustainable funding for free public access remains a challenge
  • 37. Sustainable Data Sharing Models – Three to Explore • Fee for access model (subscription model) • Agency model (agency or foundation funds public access) • Fee for deposit model (researcher writes fee into grant and pays at deposit to fund public access)
  • 38. I. Fee-for-Access Data Sharing • Funding is maintained by annual subscription fees charged to institutions; individuals at subscribing institutions have free (open) access to data • Pooled (ongoing) subscriber fees are used to acquire, curate, and maintain the service • The service, open to everyone, is thus sustained by subscribers, but agencies indicate these models are not ‘open enough’ because of the access fees
  • 39. II. Agency-funded Data Sharing • Agency sponsors/funds (ongoing) data curation & sharing enabling the public to access without charge • The archive is hosted with a curation entity like ICPSR where the public can easily discover and access data and restricted-use data can also be securely shared • Agency directs data selection and compliance policies
  • 40. III. Fee-for-Deposit Data Sharing • Depositor (individual or entity) pays for data to be curated and stored – a fee at deposit • Deposit fees should be written into the grant application • Incoming deposit fees sustain the service and the professionals behind it • Sustainability risk fairly high in this model as it depends upon: – Continuous influx of deposit fees – Depositors to put allocated fees towards curation & sharing • Data tends to be bit-level (not curated): WIDIWYG
  • 41. Fee for Deposit Services Arriving Daily! (tips for evaluating coming shortly)
  • 42. First: A Side-Note on Sharing Restricted-Use Data. What is Restricted-Use Data? • Data with disclosure risk – potential to identify a research subject • Data with highly sensitive personal information
  • 43. Common Objection/Misperception: “My data are too sensitive to share. . .” • ICPSR has been sharing restricted-use data for over a decade. Three methods are used: – Secure Download – Virtual Data Enclave – Physical Enclave • ICPSR stores & shares over 6,400 restricted-use datasets associated with over 2,000 ‘active’ restricted-use data agreements
  • 44. Reality: Restricted-use data can be effectively shared with the public • Through the use of a virtual data enclave where the data never leave the server • Where there is a process (and understanding!) to garner IRB approval from the requesting scientist’s university • Where there is a system, technology, data professionals, and collaboration space in place to disseminate (expensive to build!) • Because agencies do allow for an incremental charge to the data requestor to offset marginal costs
  • 45. Review of Public Data Sharing Services • Overview of public data sharing services we have reviewed – Some key strengths of each • Disclaimer: ICPSR has recently launched a public access service (hosted) – You’ll likely notice some bias when we talk about the strengths of openICPSR – And because we built the service, we know much more about it – Still, ICPSR’s public access service isn’t for everyone – more on that shortly
  • 48. How is openICPSR unique? openICPSR is a public data-sharing service: • Where the deposit is reviewed by professional data curators who are experts in developing metadata (tags) for the social and behavioral sciences = discoverable • With an immediate distribution network of over 750 institutions looking for research data, that has powerful search tools, and a data catalog indexed by major search engines = usage • Sustained by a respected organization with over 50 years of experience in reliably protecting research data = sustainable • Prepared to accept and disseminate sensitive and/or restricted-use data in the public-access environment = protection of research subjects
  • 49. How will openICPSR disseminate sensitive data to the public? • The deposit of sensitive (restricted-use) data is similar to the deposit of non-sensitive data except that the depositor will indicate that the data should be for restricted-use only • Dissemination of sensitive data will be through ICPSR’s virtual data enclave; in this environment, data never leave the secure server and analysis takes place in the virtual space • Scientists desiring to access the data will need to apply for the data and will pay an access fee • openICPSR has already received sensitive (restricted-use) data, and dissemination of these data has begun
  • 50. openICPSR for Institutions and Journals • Uses openICPSR platform • Fully hosted in the ICPSR cloud – no tech or patches needed • Branded with a logo and colors • Deposits incorporated into ICPSR’s data catalog • On-demand administrative usage tools
  • 51. A final note: openICPSR accepts research data from a wide array of disciplines/fields, but not all
  • 52. Tips for Evaluating a Data Sharing Service. Questions to consider when selecting a data sharing service: • How will the service sustain itself? Does it have a long term funding stream? • How will the service care for my data in the long term should the service fail? Is there a plan? A safety net? • Can the service quickly maximize discoverability of my data? Does it explain how it will do so? • Does the service have a network of interested researchers & students seeking data? Will my data get used? • Does the service have knowledge of international archiving standards? • Does the service provide a DOI, data citation, and version control should I need to update my files? • I have sensitive data or data with some disclosure risk to deposit. Does the service understand how to secure it upon intake and when sharing? Does it have experience in this area?
  • 53. Resources for Creating Data Management Plans for Grant Applications
  • 54. ICPSR’s Data Management & Curation Site http://www.icpsr.umich.edu/datamanagement/
  • 55. Purpose of Data Management Plans • Data management plans describe how researchers will provide for long-term preservation of, and access to, scientific data in digital formats. • Data management plans provide opportunities for researchers to manage and curate their data more actively from project inception to completion.
  • 57. DMP Template Tool to Get You Started!
  • 59. And still more guidelines after the project is awarded: • Guide emphasizes preparation for data sharing throughout the project • Available online and via download (pdf)
  • 60. ICPSR Data Curation Training Workshops • 1-5 day workshops on data curation/data repository management decisions – Participants learn about best practices and tools for data curation, from selecting and preparing data for archiving to optimizing and promoting data for reuse • Available via ICPSR Summer Program (Ann Arbor – July 27-31, 2015) or onsite at your institution
  • 61. Copies of these Slides & Use • Feel free to share it; present it; cite it! • Find copies of these slides on Slideshare.net – Several notes and additional links are found in the notes view
  • 62. Get More Information • Visit ICPSR’s Data Management & Curation site: http://www.icpsr.umich.edu/datamanagement/index.jsp • Contact us: – netmail@icpsr.umich.edu – (734) 647-2200 • More on Assuring Access to Scientific Data: white paper – “Sustaining Domain Repositories for Digital Data”
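
Slide 29 lists the basic elements a data citation needs: title, author, date, version, and persistent identifier. As a rough illustration only, the sketch below checks that all five are present before assembling a citation string. The class name, the element order in the output, and the sample values (including the placeholder DOI) are assumptions made for this example; they are not ICPSR's, IASSIST's, or DataCite's prescribed format.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class DataCitation:
    """The five citation elements named on slide 29 (hypothetical container)."""
    author: str
    date: str
    title: str
    version: str
    persistent_id: str

    def missing_elements(self) -> List[str]:
        # An element counts as missing when it is empty or whitespace only.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def format(self) -> str:
        # Refuse to build a citation that cannot identify the dataset over time.
        missing = self.missing_elements()
        if missing:
            raise ValueError("citation is missing: " + ", ".join(missing))
        # The ordering below is an assumption for illustration, not a standard.
        return (f"{self.author} ({self.date}). {self.title}. "
                f"{self.version}. {self.persistent_id}")


if __name__ == "__main__":
    # Hypothetical dataset; the DOI is a placeholder, not a real identifier.
    example = DataCitation(
        author="Doe, Jane",
        date="2015",
        title="Example Library Data Services Survey",
        version="Version 1",
        persistent_id="https://doi.org/10.0000/example",
    )
    print(example.format())
```

A check like this could run at deposit time, so that a dataset without a version or persistent identifier is flagged before it is published.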

Editor's notes

  1. Federal agencies are requiring data management plans as part of research proposals to increase public access to results (including research data) of federally funded scientific research. Join us for a session on sustainable data sharing models, including models for sharing restricted-use data. Demos of these models and tips for assessing hosted public data access services will be provided as well as resources for creating data management plans for grant applications.
  2. Here’s the wave of ‘big data’.
  3. Source of slide: Myron Gutmann’s IDF Meeting (June 2007). ICPSR exists to preserve and share research data to support researchers who write research articles, books, and papers; teach or utilize quantitative methods; or write grant/contract proposals (which require data management plans).
  4. Current archives/collections/repositories already meeting public access requirements regarding data:
      NACDA – NACJD – SAMHDA: examples of long term sustainability
      NAHDAP – SAMHDA – DSDR: examples of sharing of confidential data
      NACJD: example of depository/researcher compliance (holding 10% of funding to PI)
      LGBT – MET: unique infrastructure and dissemination
      Research Connections: reports and data dissemination; audiences including policymakers
  5. In January 2011, the National Science Foundation released a new requirement for proposal submissions regarding the management of data generated using NSF support. All proposals must now include a data management plan (DMP). (NIH has similar DMP requirements.) The plan is to be short, no more than two pages, and is submitted as a supplementary document. The plan needs to address two main topics: What data are generated by your research? What is your plan for managing the data? The OSTP memo directed funding agencies with an annual R&D budget over $100 million to develop a public access plan for disseminating the results of their research. It reflects a concern for investment: “Policies that mobilize these publications and data for re-use through preservation and broader public access also maximize the impact and accountability of the Federal research investment.”
  6. The OSTP Memo – Overview. Released February 22, 2013. A concern for investment: “Policies that mobilize these publications and data for re-use through preservation and broader public access also maximize the impact and accountability of the Federal research investment.” Federal agencies with over $100 M annually in R&D expenditures are to develop plans to support increased public access to the results of research funded by the Federal Government.
  7. “Maximize access, by the general public and without charge, to digitally formatted scientific data created with Federal funds…”
  8. http://www.slideshare.net/DigCurv/15-meriel-patrick
  9. 4,883 NIH & NSF PIs were emailed a survey; 1,217 responded (24.9% response rate); 1,003 responses were valid (collected data, not a dissertation). We attempted to invite all 4,883 of these PIs. The PI survey consisted of questions about research data collected, various methods for sharing research data, attitudes about data sharing, and demographic information. PIs were also asked about publications tied to the research project, including information about their own publications, research team publications, and publications outside the research team. We received 1,217 responses (24.9% response rate). For the analytic sample we select PIs and their research data if (1) they confirm they collected research data (86.6% of the responses), (2) they did not collect data for a dissertation award (n=33), or (3) they were missing data on the dependent variable.
  10. Today we’ll talk about how to prepare your data collection so that it (and this is the ultimate litmus test) “contains information intended to be complete and self-explanatory” for future users. [Quote is from the National Longitudinal Survey of Youth’s explanation of its documentation (see: http://www.nlsinfo.org/nlsy97/97guide/chap3.htm#threethree).] Why does this matter? 1) Others will be able to independently use/understand data, 2) Data will be readable (i.e., in useable formats) in the future, 3) It makes your life less complicated once you’re finished with the data collection -- you don’t need to continually explain, reformat, revise, etc. This isn’t rocket science, but it’s still important. I recognize that many watching this Webinar have extensive data backgrounds, so I’m going to convey the information as quickly and directly as possible.
  11. Cuneiform tablet
  12. EBCDIC format
  13. Such is the dilemma! Good data curation is sustained by subscriber fees that pay for good documentation, data cleaning, rendering into accessible formats, preservation including file migration and production of ASCII files, and sustained storage and website delivery (and all of the data and tech professionals conducting that work). However, this model has been determined at this time not to be open ‘enough.’
  14. This model is a fantastic solution to sustainability – providing the agency continues to allot funds for data curation and sharing. Access to the public is free with this model. The agency pays organizations like ICPSR for data and tech professionals to process, preserve, and share their data. The agency sets the rules for what data it will include in its archive (sets the data selection policy) since not all data can/should be curated. The agency can also set rules for compliance to encourage researchers to deposit and share their data. One agency requires that ICPSR provide confirmation that data and documentation have been received and are in workable shape prior to releasing the final payment to the researcher. Unfortunately, this kind of compliance rule is rare, even though it works! The risk to sustainability is not zero, however, since budget cuts to federal funding are always of concern.
  15. The signals coming from federal agencies regarding public access, as well as international momentum, are resulting in a rush to provide solutions for open (free) public access to data. Some entities are providing ‘open source’ solutions where an organization grabs the code & the tech group builds the repository out as it wishes. (They’re expected to share this code back with the community later.) It makes sense that later these entities might share a catalog – otherwise, how these data will really get discovered (required for use of the data!) is a mystery. Other entities are providing ‘fully hosted’ solutions where an organization or individual deposits data into the cloud (servers) of the hosting organization. This is the solution we’ll concentrate on when providing tips for evaluating data hosting services.
  16. Sensitive personal information isn’t about names, addresses, credit card numbers, or other direct identifying information. Research scientists should never, never, ever submit this type of information to any hosted service – ever. What we’re talking about is highly personal information (topics) within research data that may include past/present drug use, illegal activities, or perhaps sexual habits.
  17. We’re currently adding about 50 new agreements each month.
  18. Notes on the services reviewed:
      Figshare: $8-$15 per month per individual, though public space is free; funded by Digital Science out of London (funded by Macmillan Publishers); accepts all file types from all disciplines; provides DOI.
      Dryad: $80 per data package (individual), or bulk deposits from an institution for a minimum of $1,750 on up; also has member fees from $1,000 - $5,000 annually; focuses on data underlying international scientific and medical literature (replication); provides DOI.
      DataShare: There are at least two, one out of the University of Edinburgh & one out of UC San Francisco; we focus here on UCSF's. Currently funded by the California Digital Library & restricted to UCSF; however, it plans to open up to other institutions in future phases; really nice interface and great presentation of why you should share data! http://datashare.ucsf.edu/xtf/search?smode=stepsPage
      DSpaceDirect: Product of Duraspace; $3750 to $8250 annually depending on the level of storage; targeted at institutional membership to sustain funding; set up so Google can discover content; search looks localized to the institution at this time (not yet a common catalog across DSpace entities); software provides DOI; will accept data from any discipline.
      Dataverse Network: free deposits; funded by Harvard; accepts all files; provides DOI; is open source and has 8 other sites using DVN, which likely share a catalog.
      Academic Torrents: distributed data repository where the focus is to accept really large (TBs) datasets; out of UMass Boston; has about 113 datasets as large as 84 GB; no fees found; looks like a new entry attempting to offer a solution to big data!
  19. www.openicpsr.org
  20. FY 2014: 683,204 datasets downloaded; 38,924 active MyData accounts; 457,449 website visits / 300,198 unique visitors
  21. Disseminating restricted-use data carries a significant administrative burden. This includes completing and reviewing restricted-use contracts (covering IRB approval and data protection issues), placing the data into the VDE, monitoring progress and results with a disclosure review of results, and server time. The access fee charged to the data user covers these costs.
  22. openICPSR for Institutions and Journals was built to:
      • Fulfill an organization’s governmental grant & journal replication requirements
      • Brand the data-sharing service with your logo, colors, and a unique URL
      • Provide DOIs & data citations upon publishing
      • Increase exposure & reach of the organization’s research via inclusion in ICPSR’s data catalog & integration with your social media
      • Administer the fully-hosted (cloud) service economically without the need for costly technical staff or equipment
      • Share and preserve restricted-use data
      • Provide confidence that the data and service are safe & available for the long term
  23. It is sometimes easier to identify what openICPSR does not accept than what it does. openICPSR is not appropriate for the natural or hard sciences (bio-medical). It is also not appropriate for huge datasets of multiple GBs of data. Our metadata experts and our catalog are focused on a very broadly defined area known as the social and behavioral sciences. For repositories outside ICPSR’s domain, see Stanford’s list: http://library.stanford.edu/research/data-management-services/share-and-preserve-research-data/domain-specific-data-repositories
  24. • A collection of resources (links) to assist in data management plans for grant proposals
      • Tools to prepare plans (templates & sample plans)
      • Contact information for plan advice
  25. https://dmp.cdlib.org/ Puts together the basic structure & form for your DMP. Note that it isn’t plug-and-go: you still need to add the reasoning behind your management plans, based on the discipline and/or the data being collected.
  26. 22 pages of guidelines and references, including a sample plan (boilerplate!), available for download. Link to the PDF document: http://www.icpsr.umich.edu/files/datamanagement/DataManagementPlans-All.pdf
  27. PDF link to the data prep guide: http://www.icpsr.umich.edu/files/deposit/dataprep.pdf More information on data preparation for archiving: http://www.icpsr.umich.edu/icpsrweb/content/deposit/guide/