Presenters: Libbie Stephenson, Jared Lyle
This session discusses the value of and methods for curating data, especially in light of recent government and academic initiatives. Special attention will be paid to data management plans.


  1. Adding Value through Data Curation
     APLIC, April 29, 2015, San Diego, California
     Libbie Stephenson, UCLA (libbie@ucla.edu)
     Jared Lyle, ICPSR (lyle@umich.edu)
  2. UCLA Social Science Data Archive
     • Established mid-1960s
     • Small domain-specific archive of data for use in quantitative research
     • Surveys, enumerations, public opinion polls, administrative records
     • Three full-time staff; part-time student interns
     • Holdings are partly files deposited by faculty
     • Services include: reference, statistical, data management plans, research project support
  3. ICPSR
     • 50+ years of experience
     • Data stewardship
     • Data management
     • Data curation
     • Data preservation
  4. ICPSR Archives/Repositories
  5. Recent Federal Data Sharing Initiatives
     • NIH: 2003 – data sharing plans
     • NSF: 2011 – data management plans
     • OSTP: 2013 – Memo with subject “Increasing Access to the Results of Federally Funded Scientific Research”
  6. https://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
  7. http://sites.nationalacademies.org/DBASSE/CurrentProjects/DBASSE_082378
  8. http://guides.library.oregonstate.edu/federaloa
  9. http://bit.ly/FedOASummary
  10. Data Portion of Memo - 13 Elements
      • The elements are also summarized online within ICPSR’s Web site: http://icpsr.umich.edu/content/datamanagement/ostp.html
  11. “It saves funding and avoids repeated data collecting efforts, allows the verification and replication of research findings, facilitates scientific openness, deters scientific misconduct, and supports communication and progress.”
      Niu (2006). “Reward and Punishment Mechanism for Research Data Sharing.” http://www.iassistdata.org/downloads/iqvol304niu.pdf
  12. UK results on data sharing attitudes
      • In 2011 survey, 85% of researchers said they thought their data would be of interest to others.
      • Only 41% said they would be happy to make their data available.
      • Only a third had previously published data.
      Source: DaMaRO Project, University of Oxford http://www.slideshare.net/DigCurv/15-meriel-patrick
  13. Data Sharing Status

      | Federal agency | Shared formally, archived (n=111) | Shared informally, not archived (n=415) | Not shared (n=409) |
      |---|---|---|---|
      | NSF (27.3%) | 22.4% | 43.7% | 33.9% |
      | NIH (72.7%) | 7.4% | 45.0% | 47.6% |
      | Total | 11.5% | 44.6% | 43.9% |

      Pienta, Gutmann, & Lyle (2009). “Research Data in The Social Sciences: How Much is Being Shared?” http://ori.hhs.gov/content/research-research-integrity-rri-conference-2009
      See also:
      • Pienta, Gutmann, Hoelter, Lyle, & Donakowski (2008). “The LEADS Database at ICPSR: Identifying Important ‘At Risk’ Social Science Data.” http://www.data-pass.org/sites/default/files/Pienta_et_al_2008.pdf
      • Pienta, Alter, & Lyle (2010). “The Enduring Value of Social Science Research: The Use and Reuse of Primary Research Data”. http://hdl.handle.net/2027.42/78307
  14. Adding Value through Data Curation
      1. Maximize access
      2. Protect confidentiality and privacy
      3. Appropriate attribution
      4. Long term preservation and sustainability
      5. Data management planning
  15. Maximize Access (Data Curation)
  16. Discoverable http://www.flickr.com/photos/papertrix/38028138/
  17. Accessible http://www.guardian.co.uk/science/grrlscientist/2012/mar/29/1
  18. A well-prepared data collection “contains information intended to be complete and self-explanatory” for future users.
  19. Protect confidentiality and privacy
      • It is critically important to protect the identities of research subjects
      • Disclosure risk is a term that is often used for the possibility that a data record from a study could be linked to a specific person
      • Data with these risks can be shared via a secured virtual environment
      • Data concerning very sensitive topics can also be shared via a secured environment
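The idea of disclosure risk on this slide can be illustrated with a small sketch. It assumes a study stored as a list of dictionaries; the column names, the split into direct identifiers and quasi-identifiers, and the rule "a record is at risk if its quasi-identifier combination is unique" are all illustrative assumptions, not a procedure described by the presenters or used by ICPSR.

```python
from collections import Counter

# Hypothetical column names; a real study would define its own lists.
DIRECT_IDENTIFIERS = {"name", "email"}
QUASI_IDENTIFIERS = ("zip_code", "birth_year", "sex")


def drop_direct_identifiers(records):
    """Return copies of the records without direct-identifier fields."""
    return [
        {key: value for key, value in record.items() if key not in DIRECT_IDENTIFIERS}
        for record in records
    ]


def flag_unique_records(records):
    """Return indexes of records whose quasi-identifier combination is unique."""
    keys = [tuple(record.get(field) for field in QUASI_IDENTIFIERS) for record in records]
    counts = Counter(keys)
    return [index for index, key in enumerate(keys) if counts[key] == 1]


# Made-up respondents, for illustration only.
respondents = [
    {"name": "A", "email": "a@example.org", "zip_code": "90095", "birth_year": 1980, "sex": "F", "score": 3},
    {"name": "B", "email": "b@example.org", "zip_code": "90095", "birth_year": 1980, "sex": "F", "score": 5},
    {"name": "C", "email": "c@example.org", "zip_code": "48106", "birth_year": 1955, "sex": "M", "score": 4},
]

cleaned = drop_direct_identifiers(respondents)
at_risk = flag_unique_records(cleaned)
print(at_risk)  # indexes of records that might need review or restricted access
```

Records flagged this way are candidates for further review, modification, or release only through a secured environment; the check itself does not make data safe to share.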
  20. Restricted-use data can be effectively shared with the public
      • ICPSR has been sharing restricted-use data for over a decade. Three methods are used:
        – Secure Download
        – Virtual Data Enclave
        – Physical Enclave
      • ICPSR stores & shares over 6,400 restricted-use datasets associated with over 2,000 ‘active’ restricted-use data agreements
  21. Appropriate Attribution
      • Properly citing data encourages the replication of scientific results, improves research standards, guarantees persistent reference, and gives proper credit to data producers.
      • Citing data is straightforward. Each citation must include the basic elements that allow a unique dataset to be identified over time: title, author, date, version, and persistent identifier (e.g., DOI).
      • Resources: ICPSR's Data Citations page, IASSIST's Quick Guide to Data Citation, DataCite.
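The five citation elements listed on this slide can be sketched as a small record type. The class name, the output layout, and the sample values (including the DOI, which is a placeholder and not a registered identifier) are assumptions for illustration; they do not reproduce any particular citation style.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """The basic elements the slide lists for a data citation."""
    author: str
    date: str
    title: str
    version: str
    identifier: str  # persistent identifier, e.g. a DOI

    def format(self) -> str:
        # Illustrative layout only; follow your repository's recommended style.
        return f"{self.author} ({self.date}). {self.title}. {self.version}. {self.identifier}"


def missing_elements(citation: DataCitation) -> list:
    """Return the names of any elements that were left blank."""
    return [name for name, value in vars(citation).items() if not value.strip()]


# Placeholder values, not a real dataset.
example = DataCitation(
    author="Example, A.",
    date="2015",
    title="Example Survey Data",
    version="Version 1",
    identifier="https://doi.org/10.0000/example",
)
print(example.format())
print(missing_elements(example))
```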
  22. Long term preservation and sustainability
  23. “Digital information lasts forever or five years, whichever comes first”. -Jeff Rothenberg
  24. https://flic.kr/p/arHsh4
  25. http://www.flickr.com/photos/blude/2665906010/
  26. Data Management Planning
      • Data management plans describe how researchers will provide for long-term preservation of, and access to, scientific data in digital formats.
      • Data management plans provide opportunities for researchers to manage and curate their data more actively from project inception to completion.
  27. ICPSR’s Data Management & Curation Site http://www.icpsr.umich.edu/datamanagement/
  28. Purpose of Data Management Plans
      • Data management plans describe how researchers will provide for long-term preservation of, and access to, scientific data in digital formats.
      • Data management plans provide opportunities for researchers to manage and curate their data more actively from project inception to completion.
  29. Elements of a Data Management Plan
      Element: Data description
      Description: Provide a brief description of the information to be gathered; the nature, scope and scale of the data that will be generated or collected.
      Recommended? Highly recommended.
      Generic Example 1: This project will produce public-use nationally representative survey data for the United States covering Americans' social backgrounds, enduring political predispositions, social and political values, perceptions and evaluations of groups and candidates, opinions on questions of public policy, and participation in political life.
  30. Elements of a Data Management Plan: Data description (continued)
      Generic Example 2: This project will generate data designed to study the prevalence and correlates of DSM III-R psychiatric disorders and patterns and correlates of service utilization for these disorders in a nationally representative sample of over 8000 respondents. The sensitive nature of these data will require that the data be released through a restricted use contract.
  31. Elements of a Data Management Plan: Data description (continued)
      ICPSR: [Provide a brief description of the information to be gathered -- the nature, scope, and scale of the data that will be generated or collected.] These data, which will be submitted to ICPSR, fit within the scope of the ICPSR Collection Development Policy. A letter of support describing ICPSR's commitment to the data as they have been described is provided.
  32. Elements of a Data Management Plan
      Element: Access and sharing
      Description: Indicate how you intend to archive and share your data and why you have chosen that particular option.
      Recommended? Highly recommended.
      Generic Example 1: The research data from this project will be deposited with [repository] to ensure that the research community has long-term access to the data.
  33. Elements of a Data Management Plan: Access and sharing (continued)
      Generic Example 2: The project team will create a dedicated Web site to manage and distribute the data because the audience for the data is small and has a tradition of interacting as a community. The site will be established using a content management system like Drupal or Joomla so that data users can participate in adding site content over time, making the site self-sustaining. The site will be available at a .org location. For preservation, we will supply periodic copies of the data to [repository]. That repository will be the ultimate home for the data.
  34. Elements of a Data Management Plan: Access and sharing (continued)
      Generic Example 3: The research data from this project will be deposited with [repository] to ensure that the research community has long-term access to the data. The data will be under embargo for one year while the investigators complete their analyses.
  35. Elements of a Data Management Plan: Access and sharing (continued)
      Generic Example 4: The research data from this project will be deposited with the institutional repository on the grantees' campus.
  36. Elements of a Data Management Plan: Access and sharing (continued)
      ICPSR: The research data from this project will be deposited with the digital repository of the Inter-university Consortium for Political and Social Research (ICPSR) to ensure that the research community has long-term access to the data. The integrated data management plan proposed leverages capabilities of ICPSR and its trained archival staff.
  37. Elements of a Data Management Plan: Access and sharing (continued)
      ICPSR: ICPSR will make the research data from this project available to the broader social science research community.
      Public-use data files: These files, in which direct and indirect identifiers have been removed to minimize disclosure risk, may be accessed directly through the ICPSR Web site. After agreeing to Terms of Use, users with an ICPSR MyData account and an authorized IP address from a member institution may download the data, and non-members may purchase the files.
      Restricted-use data files: These files are distributed in those cases when removing potentially identifying information would significantly impair the analytic potential of the data. Users (and their institutions) must apply for these files, create data security plans, and agree to other access controls.
  38. Elements of a Data Management Plan: Access and sharing (continued)
      ICPSR (continued): Timeliness: The research data from this project will be supplied to ICPSR before the end of the project so that any issues surrounding the usability of the data can be resolved. Delayed dissemination may be possible. The Delayed Dissemination Policy allows for data to be deposited but not disseminated for an agreed-upon period of time (typically one year).
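The embargo and delayed-dissemination examples above come down to a date calculation. A minimal sketch follows, assuming a one-year default and a hypothetical helper name `release_date`; the actual period is whatever the depositor and repository agree on.

```python
from datetime import date


def release_date(deposit: date, embargo_years: int = 1) -> date:
    """Return the earliest dissemination date after an embargo measured in whole years."""
    try:
        return deposit.replace(year=deposit.year + embargo_years)
    except ValueError:
        # A 29 February deposit has no counterpart in a non-leap year; use 28 February.
        return deposit.replace(year=deposit.year + embargo_years, day=28)


# Hypothetical deposit date, for illustration only.
print(release_date(date(2015, 4, 29)))
```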
  39. Elements of a Data Management Plan
      Element: Metadata
      Description: A description of the metadata to be provided along with the generated data, and a discussion of the metadata standards used.
      Recommended? Highly recommended.
      Generic Example 1: Metadata will be tagged in XML using the Data Documentation Initiative (DDI) format. The codebook will contain information on study design, sampling methodology, fieldwork, variable-level detail, and all information necessary for a secondary analyst to use the data accurately and effectively.
  40. Elements of a Data Management Plan: Metadata (continued)
      Generic Example 2: The clinical data collected from this project will be documented using CDISC metadata standards.
  41. Elements of a Data Management Plan: Metadata (continued)
      ICPSR: Substantive metadata will be provided in compliance with the most relevant standard for the social, behavioral, and economic sciences -- the Data Documentation Initiative (DDI). This XML standard provides for the tagging of content, which facilitates preservation and enables flexibility in display. These types of metadata will be produced and archived:
      Study-Level Metadata Record. A summary DDI-based record will be created for inclusion in the searchable ICPSR online catalog. This record will be indexed with terms from the ICPSR Thesaurus to enhance data discovery.
  42. Elements of a Data Management Plan: Metadata (continued)
      ICPSR (continued):
      Data Citation with Digital Object Identifier (DOI). A standard citation will be provided to facilitate attribution. The DOI provides permanent identification for data & ensures that they will always be found at the URL.
      Variable-Level Documentation. ICPSR will tag variable-level information in DDI format for inclusion in ICPSR's Social Science Variables Database (SSVD), which allows users to identify relevant variables & studies of interest.
  43. Elements of a Data Management Plan: Metadata (continued)
      ICPSR (continued):
      Technical Documentation. The variable-level files described above will serve as the foundation for the technical documentation or codebook that ICPSR will prepare and deliver.
      Related Publications. Resources permitting, ICPSR will periodically search for publications based on the data and provide two-way linkages between data and publications.
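To make "metadata tagged in XML" more concrete, here is a minimal sketch that builds a study title and variable labels as an XML tree. The element names are loosely modeled on DDI Codebook (`codeBook`, `stdyDscr`, `dataDscr`, `var`, `labl`), but the output omits namespaces and required elements and has not been validated against the DDI schema; the function name and the sample study are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, variables):
    """Build a small DDI-style XML tree from a study title and {name: label} pairs."""
    code_book = ET.Element("codeBook")

    # Study-level description: just a title in this sketch.
    study = ET.SubElement(code_book, "stdyDscr")
    citation = ET.SubElement(study, "citation")
    title_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(title_stmt, "titl").text = title

    # Variable-level description: one <var> per variable, each with a label.
    data = ET.SubElement(code_book, "dataDscr")
    for name, label in variables.items():
        var = ET.SubElement(data, "var", name=name)
        ET.SubElement(var, "labl").text = label
    return code_book


# Made-up study and variables, for illustration only.
root = build_codebook(
    "Example Survey",
    {"AGE": "Age of respondent", "VOTE": "Voted in last election"},
)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Keeping variable names and labels in a structured form like this is what lets a catalog index them for searching, in the way the slides describe for variable-level documentation.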
  44. Elements of a Data Management Plan
      Element: Intellectual property rights
      Description: Entities or persons who will hold the intellectual property rights to the data, and how IP will be protected if necessary. Any copyright constraints (e.g., copyrighted data collection instruments) should be noted.
      Recommended? Highly recommended.
      Generic Example 1: The principal investigators on the project and their institutions will hold the copyright for the research data they generate.
  45. Elements of a Data Management Plan: Intellectual property rights (continued)
      Generic Example 2: The principal investigators on the project and their institutions will hold the copyright for the research data they generate but will grant redistribution rights to [repository] for purposes of data sharing.
  46. Elements of a Data Management Plan: Intellectual property rights (continued)
      Generic Example 3: The data gathered will use a copyrighted instrument for some questions. A reproduction of the instrument will be provided to [repository] as documentation for the data deposited with the intention that the instrument be distributed under "fair use" to permit data sharing, but it may not be redisseminated by users.
  47. Elements of a Data Management Plan: Intellectual property rights (continued)
      ICPSR: Principal investigators and their institutions hold the copyright for the research data they generate. By depositing with ICPSR, investigators do not transfer copyright but instead grant permission for ICPSR to redisseminate the data and to transform the data as necessary to protect respondent confidentiality, improve usefulness, and facilitate preservation.
  48. 48. Elements of a Data Management Plan Element Description Recommended? Ethics and privacy A discussion of how informed consent will be handled and how privacy will be protected, including any exceptional arrangements that might be needed to protect participant confidentiality, and other ethical issues that may arise. Highly recommended. Generic Example 1: For this project, informed consent statements will use language that will not prohibit the data from being shared with the research community.
  49. 49. Elements of a Data Management Plan Element Description Recommended? Ethics and privacy A discussion of how informed consent will be handled and how privacy will be protected, including any exceptional arrangements that might be needed to protect participant confidentiality, and other ethical issues that may arise. Highly recommended. Generic Example 2: The following language will be used in the informed consent: The information in this study will only be used in ways that will not reveal who you are. You will not be identified in any publication from this study or in any data files shared with other researchers. Your participation in this study is confidential. Federal or state laws may require us to show information to university or government officials [or sponsors], who are responsible for monitoring the safety of this study.
  50. 50. Elements of a Data Management Plan. Element: Ethics and privacy. Description: A discussion of how informed consent will be handled and how privacy will be protected, including any exceptional arrangements that might be needed to protect participant confidentiality, and other ethical issues that may arise. Recommended? Highly recommended. Generic Example 3: The proposed medical records research falls under the HIPAA Privacy Rule. Consequently, the investigators will provide documentation that an alteration or waiver of research participants' authorization for use/disclosure of information about them for research purposes has been approved by an IRB or a Privacy Board.
  51. 51. Elements of a Data Management Plan. Element: Ethics and privacy. Description: A discussion of how informed consent will be handled and how privacy will be protected, including any exceptional arrangements that might be needed to protect participant confidentiality, and other ethical issues that may arise. Recommended? Highly recommended. ICPSR: Informed consent: For this project, informed consent statements, if applicable, will not include language that would prohibit the data from being shared with the research community. Disclosure risk management: The research project will remove any direct identifiers in the data before deposit with ICPSR. Once deposited, the data will undergo procedures to protect the confidentiality of individuals whose personal information may be part of archived data. These include: (1) rigorous review to assess disclosure risk, (2) modifying data if necessary to protect confidentiality, (3) limiting access to datasets in which risk of disclosure remains high, and (4) consultation with data producers to manage disclosure risk. ICPSR will assign a qualified data manager certified in disclosure risk management to act as steward for the data while they are being processed. The data will be processed and managed in a secure non-networked
  52. 52. Elements of a Data Management Plan. Element: Format. Description: Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats. Recommended? Highly recommended. Generic Example 1: Quantitative survey data files generated will be processed and submitted to the [repository] as SPSS system files with DDI XML documentation. The data will be distributed in several widely used formats, including ASCII, tab-delimited (for use with Excel), SAS, SPSS, and Stata. Documentation will be provided as PDF. Data will be stored as ASCII along with setup files for the statistical software packages. Documentation will be preserved using XML and PDF/A.
  53. 53. Elements of a Data Management Plan. Element: Format. Description: Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats. Recommended? Highly recommended. Generic Example 2: Digital video data files generated will be processed and submitted to the [repository] in MPEG-4 (.mp4) format.
  54. 54. Elements of a Data Management Plan. Element: Format. Description: Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats. Recommended? Highly recommended. ICPSR: Submission: The data and documentation will be submitted to ICPSR in recommended formats. Access: ICPSR will make the quantitative data files available in several widely used formats, including ASCII, tab-delimited (for use with Excel), SAS, SPSS, and Stata. Documentation will be provided as PDF. Preservation: Data will be stored in accordance with prevailing standards and practice. Currently, ICPSR stores quantitative data as ASCII along with setup files for the statistical software packages, and documentation is preserved using XML and PDF/A.
  55. 55. Elements of a Data Management Plan. Element: Archiving and preservation. Description: The procedures in place or envisioned for long-term archiving and preservation of the data, including succession plans for the data should the expected archiving entity go out of existence. Recommended? Highly recommended. Generic Example 1: By depositing data with [repository], our project will ensure that the research data are migrated to new formats, platforms, and storage media as required by good practice.
  56. 56. Elements of a Data Management Plan. Element: Archiving and preservation. Description: The procedures in place or envisioned for long-term archiving and preservation of the data, including succession plans for the data should the expected archiving entity go out of existence. Recommended? Highly recommended. Generic Example 2: In addition to distributing the data from a project Web site, future long-term use of the data will be ensured by placing a copy of the data into [repository], ensuring that best practices in digital preservation will safeguard the files.
  57. 57. Elements of a Data Management Plan. Element: Archiving and preservation. Description: The procedures in place or envisioned for long-term archiving and preservation of the data, including succession plans for the data should the expected archiving entity go out of existence. Recommended? Highly recommended. ICPSR: ICPSR is a data archive with a nearly 50-year track record for preserving and making data available over several generational shifts in technology. ICPSR will accept responsibility for long-term preservation of the research data upon receipt of a signed deposit form. This responsibility includes a commitment to manage successive iterations of the data if new waves or versions are deposited. ICPSR will ensure that the research data are migrated to new formats, platforms, and storage media as required by good practice in the digital preservation community. Good practice for digital preservation requires that an organization address succession planning for digital assets. ICPSR has a commitment to designate a successor in the unlikely event that such a need arises.
  58. 58. Elements of a Data Management Plan. Element: Storage and backup. Description: Storage methods and backup procedures for the data, including the physical and cyber resources and facilities that will be used for the effective preservation and storage of the research data. Recommended? Highly recommended. Generic Example 1: [Repository] will place a master copy of each digital file (i.e., research data files, documentation, and other related files) in Archival Storage, with several copies stored at designated locations and synchronized with the master through the Storage Resource Broker.
  59. 59. Elements of a Data Management Plan. Element: Storage and backup. Description: Storage methods and backup procedures for the data, including the physical and cyber resources and facilities that will be used for the effective preservation and storage of the research data. Recommended? Highly recommended. ICPSR: Research has shown that multiple locally and geographically distributed copies of digital files are required to keep information safe. Accordingly, ICPSR will place a master copy of each digital file (i.e., research data files, documentation, and other related files) in ICPSR's Archival Storage, with several copies stored with partner organizations at designated locations and synchronized with the master.
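One way to check that stored copies remain synchronized with a master file is to compare checksums. The sketch below is only an illustration of that idea, not a description of ICPSR's or any repository's actual tooling; the function names and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 64 KiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def out_of_sync_copies(master: Path, copies: list[Path]) -> list[Path]:
    """List copies that are missing or whose checksum differs from the master's.

    Hypothetical helper for illustration; the order of `copies` is preserved.
    """
    expected = sha256_of(master)
    return [c for c in copies if not c.is_file() or sha256_of(c) != expected]
```

Running a check like this on a schedule, and recording the results, gives a simple audit trail showing that each distributed copy still matches the master.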
  60. 60. Elements of a Data Management Plan. Element: Existing data. Description: A survey of existing data relevant to the project and a discussion of whether and how these data will be integrated. Generic Example 1: Few datasets exist that focus on this population in the United States and how their attitudes toward assimilation differ from those of others. The primary resource on this population, [give dataset title here], is inadequate because...
  61. 61. Elements of a Data Management Plan. Element: Existing data. Description: A survey of existing data relevant to the project and a discussion of whether and how these data will be integrated. Generic Example 2: Data have been collected on this topic previously (for example: [add example(s)]). The data collected as part of this project reflect the current time period and historical context. It is possible that several of these datasets, including the data collected here, could be combined to better understand how social processes have unfolded over time.
  62. 62. Elements of a Data Management Plan. Element: Data organization. Description: How the data will be managed during the project, with information about version control, naming conventions, etc. Generic Example 1: Data will be stored in a CVS system and checked in and out for purposes of versioning. Variables will use a standardized naming convention consisting of a prefix, root, suffix system. Separate files will be managed for the two kinds of records produced: one file for respondents and another file for children with merging routines specified.
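A prefix, root, suffix naming convention and a respondent-to-child merging routine could look roughly like the sketch below. Everything specific here is assumed for illustration: the pattern (`r_` or `c_` prefix, lowercase root, `w` plus wave number as suffix), the `resp_id` key, and the function names are hypothetical and are not taken from the slides.

```python
import re

# Hypothetical convention: <prefix>_<root>_<suffix>, e.g. "r_educ_w1".
# prefix: r (respondent) or c (child); suffix: "w" followed by a wave number.
NAME_PATTERN = re.compile(
    r"^(?P<prefix>[rc])_(?P<root>[a-z][a-z0-9]*)_(?P<suffix>w\d+)$"
)


def parse_variable_name(name: str) -> dict[str, str]:
    """Split a variable name into prefix, root, and suffix, or raise ValueError."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the prefix_root_suffix convention")
    return match.groupdict()


def merge_children(respondents: list[dict], children: list[dict],
                   key: str = "resp_id") -> list[dict]:
    """Attach each child record to its respondent record (one output row per child)."""
    by_id = {row[key]: row for row in respondents}
    merged = []
    for child in children:
        parent = by_id.get(child[key])
        if parent is None:
            raise KeyError(f"child record has no respondent with {key}={child[key]!r}")
        merged.append({**parent, **child})
    return merged
```

Writing the convention down as a pattern lets the project reject nonconforming variable names at check-in rather than discovering them at deposit time.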
  63. 63. Elements of a Data Management Plan. Element: Quality Assurance. Description: Procedures for ensuring data quality during the project. Example 1: Quality assurance measures will comply with the standards, guidelines, and procedures established by the World Health Organization.
  64. 64. Elements of a Data Management Plan. Element: Quality Assurance. Description: Procedures for ensuring data quality during the project. Example 2: For quantitative data files, the [repository] ensures that missing data codes are defined, that actual data values fall within the range of expected values and that the data are free from wild codes. Processed data files are reviewed by a supervisory staff member before release.
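The checks named in this example (defined missing-data codes, values within an expected range, no wild codes) can be sketched in a few lines. This is a minimal illustration under assumed inputs, not the procedure any particular repository uses; the function names and the sample codes in the usage note are hypothetical.

```python
def find_wild_codes(values, valid_codes, missing_codes):
    """Return the sorted distinct values that are neither valid nor declared missing."""
    allowed = set(valid_codes) | set(missing_codes)
    return sorted({v for v in values if v not in allowed})


def out_of_range(values, low, high, missing_codes=()):
    """Return values outside [low, high], skipping declared missing-data codes."""
    missing = set(missing_codes)
    return [v for v in values if v not in missing and not (low <= v <= high)]
```

For instance, with valid codes 1 to 3 and 9 declared as missing, `find_wild_codes([1, 2, 9, 7, 2, -8], [1, 2, 3], [9])` flags 7 and -8 for review before release.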
  65. 65. Elements of a Data Management Plan. Element: Security. Description: A description of technical and procedural protections for information, including confidential information, and how permissions, restrictions, and embargoes will be enforced. Example 1: The data will be processed and managed in a secure non-networked environment using virtual desktop technology.
  66. 66. Elements of a Data Management Plan. Element: Responsibility. Description: Names of the individuals responsible for data management in the research project. Example 1: The project will assign a qualified data manager certified in disclosure risk management to act as steward for the data while they are being collected, processed, and analyzed.
  67. 67. Elements of a Data Management Plan. Element: Budget. Description: The costs of preparing data and documentation for archiving and how these costs will be paid. Requests for funding may be included. Example 1: Staff time has been allocated in the proposed budget to cover the costs of preparing data and documentation for archiving. The [repository] has estimated their additional cost to archive the data is [insert dollar amount]. This fee appears in the budget for this application as well.
  68. 68. Elements of a Data Management Plan. Element: Legal requirements. Description: A listing of all relevant federal or funder requirements for data management and data sharing. Example 1: The proposed medical records research falls under the HIPAA Privacy Rule. Consequently, the investigators will provide documentation that an alteration or waiver of research participants' authorization for use/disclosure of information about them for research purposes has been approved by an IRB or a Privacy Board.
  69. 69. Elements of a Data Management Plan. Element: Audience. Description: Describe the audience of users for the data. Example 1: The data to be produced will be of interest to demographers studying family formation practices in early adulthood across different racial and ethnic groups. In addition to the research community, we expect these data will be used by practitioners and policymakers.
  70. 70. Elements of a Data Management Plan. Element: Selection and retention periods. Description: A description of how data will be selected for archiving, how long the data will be held, and plans for eventual transition or termination of the data collection in the future. Example 1: Our project will generate a large volume of data, some of which may not be appropriate for sharing since it involves a small sample that is not representative. The investigators will work with staff of the [repository] to determine what to archive and how long the deposited data should be retained.
  71. 71. Data Management Plan Resources
  72. 72. Guidelines for Download
  73. 73. Data Curation Profiles Toolkit http://datacurationprofiles.org/
  74. 74. DMP Template Tool
  75. 75. And still more guidelines after the project is awarded: • Guide emphasizes preparation for data sharing throughout the project • Available online and via download (pdf)
  76. 76. http://www.data-archive.ac.uk/media/2894/managingsharing.pdf
  77. 77. http://www.dataone.org/sites/all/documents/DataONE_BP_Primer_020212.pdf
  78. 78. Involving ICPSR Further • In addition to reviewing ICPSR’s website materials, you may want to: • Contact ICPSR to discuss whether a future data collection fits within the ICPSR collection. Note: The earlier the better! • Request a Letter of Support for a grant application from ICPSR indicating support for archiving the data with ICPSR. Note: The earlier the better! • Determine if there are any costs to archiving the data with ICPSR. Note: The earlier the better!
  79. 79. UCLA SSDA – What we know now • A Data Management Plan is not a static document; it is a continuous process • Review plan over time – update, revise, prioritize next steps • Focus on meeting needs of managing during and after project – for researcher, team and future users
  80. 80. UCLA SSDA – Where we are now • Training of librarians to do curation essential • Internships for students from Information Studies • Redesigned workflow for appraisal and ingest • More focus on data quality review • Increase use of tools to create metadata
  81. 81. What is quality data? • Data creation: Relevant, Accurate, Ethical, Complete, Timely • First Use: Understandable • Reuse: Independently understandable, Findable / Shared / Open / Public, Preserved. From: Peer, Green and Stephenson, IDCC, February 2014
  82. 82. Data Quality Review From: Peer, Green and Stephenson, IDCC, February 2014
  83. 83. Libraries, Archives and Repositories – Represent many different approaches to data management • Who are the stakeholders? • What are their roles and responsibilities? • What are the policies for collection and archiving? • Goals for long-term usability? • How do the policies govern the infrastructure? • Consider the type/format/discipline, organizational environment, mission, staff competencies, technical capabilities, finances
  84. 84. Standards and certification • Data Seal of Approval (DSA) – self-audit • ISO 16363/TDR – peer review • Trustworthy Repositories Audit & Certification (TRAC) – complete external audit • Digital Repository Audit Method Based on Risk Assessment (DRAMBORA) – carry out periodically
  85. 85. Important to note: Managing data is not just about models and technology. Requires an organizational ecology. Involves people, processes and policies. http://web.stanford.edu/group/dlss/pasig/PASIG_September2014/20140916_Presentations/20140916_03_System_Architecture_for_Digital_Preservation_Neil_Jefferies.pdf
  86. 86. Considerations • People – still need to build workforce capacity • Basic data curation skills • Using metadata schemas, software, sysadmin • External influences and pressures • Funder requirements • New kinds of data • Systems will need to evolve – will become obsolete • How will all this be financially sustained? http://web.stanford.edu/group/dlss/pasig/PASIG_September2014/20140916_Presentations/20140916_03_System_Architecture_for_Digital_Preservation_Neil_Jefferies.pdf
  87. 87. Thank you! libbie@ucla.edu lyle@umich.edu
  88. 88. Public Data Sharing Services
  89. 89. UCLA SSDA – Curation Practices • Follow OAIS to appraise and ingest files • Data Quality Review • DDI Metadata • Data and metadata processed in Colectica • Carry out media format migration when necessary • Process for use with SDA • DataPASS deposit

Presenters: Libbie Stephenson, Jared Lyle. This session discusses the value of and methods for curating data, especially in light of recent government and academic initiatives. Special attention will be paid to data management plans.
