Needs for Data Management & Citation Throughout the Information Lifecycle
1. Prepared for
NISO Forum:
Tracking it Back to the Source: Managing and Citing Research Data
September 2012
Micah Altman
Director of Research, MIT Libraries
2. Collaborators and Co-Conspirators
• Jonathan Crabtree, Merce Crosas, Gary King, Tom
Lipkis, Nancy McGovern, John Willinsky
• Research Support
– Library of Congress (PA#NDP03-1)
– National Science Foundation (DMS-0835500, SES 0112072)
– Institute of Museum and Library Services (LG-05-09-0041-09)
– Sloan Foundation
– Amazon Web Services
– Massachusetts Institute of Technology
Needs for Data Management & Citation 2
3. Related Work
Reprints available from:
http://maltman.hmdc.harvard.edu
• Altman, M. 2012. Data Citation in The Dataverse Network ®. In P. F. Uhlir (Ed.), Developing Data Attribution and Citation Practices and Standards: Report from an International Workshop. National Academies Press. Forthcoming.
• Altman, M., & Crabtree, J. 2011. Using the SafeArchive System : TRAC-Based Auditing of LOCKSS.
Archiving 2011 (pp. 165–170). Society for Imaging Science and Technology.
• M. Altman, Adams, M., Crabtree, J., Donakowski, D., Maynard, M., Pienta, A., & Young, C. 2009. "Digital preservation through archival collaboration: The Data Preservation Alliance for the Social Sciences." The American Archivist. 72(1): 169-182.
• M. Altman. 2008. "A Fingerprint Method for Verification of Scientific Data", in Advances in Systems, Computing Sciences and Software Engineering (Proceedings of the International Conference on Systems, Computing Sciences and Software Engineering 2007), Springer-Verlag.
• M. Altman and G. King. 2007. “A Proposed Standard for the Scholarly Citation of Quantitative Data”,
D-Lib, 13, 3/4 (March/April).
4. Preview
• Principled approach to data management
• Lifecycle data management planning
• Lifecycle data management tracking
• Lifecycle data management infrastructure
• [Exemplar Projects]
6. “Data science is suddenly sexy – does that mean data is the new black?”
7. Valuable Data is Lost
• Researchers lack archiving capability
• Incentives for data sharing are weak
Examples
– Intentionally Discarded: “Destroyed, in accord with [nonexistent] APA 5-year post-publication rule.”
– Unintentional Hardware Problems: “Some data were collected, but the data file was lost in a technical malfunction.”
– Acts of Nature: “The data from the studies were on punched cards that were destroyed in a flood in the department in the early 80s.”
– Discarded or Lost in a Move: “As I retired …. Unfortunately, I simply didn’t have the room to store these data sets at my house.”
– Obsolescence: “Speech recordings stored on a LISP Machine…, an experimental computer which is long obsolete.”
– Simply Lost: “For all I know, they are on a [University] server, but it has been literally years and years since the research was done, and my files are long gone.”
Research by:
8. Unpublished Data Ends up in the “Desk Drawer”
• Null results are less likely to be published
• Outliers are routinely discarded
Daniel Shechtman’s lab notebook providing initial evidence of quasicrystals
9. Data Behind Publications Unavailable for Review, Reuse, Replication
10. Model Science
“Citations to unpublished data and personal
communications cannot be used to support
claims in a published paper”
“All data necessary to understand, assess,
and extend the conclusions of the
manuscript must be available to any reader
of Science.”
11. Compliance with Policies is Low
• Compliance is low even in the best examples of journals
• Checking compliance manually is tedious and doesn’t scale
12. Special Challenges for Long-Term Access to New Forms of Data
• Some Examples
– GIS and geospatial trails
– Facebook & social networks
– Text: blogs, tweets
– Cell phone data
• Challenges
– Proprietary – intellectual property
– Size
– Dynamic content
– Fixity
– Format
Source: [Calberese 2008]
14. “The published article is not scientific output – it’s a summary of scientific output.”
-- corollary of Buckheit & Donoho 1995
15. Information Lifecycle
[Diagram: stages of the information lifecycle, arranged as a cycle]
• Creation/Collection
• Storage/Ingest
• Processing
• Internal Sharing
• Analysis
• External dissemination/publication
• Re-use
– Scientific
– Educational
– Scientometric
– Institutional
• Long-term access
16. Stakeholders
[Diagram: stakeholders arranged around the information lifecycle stages]
• Data Sources/Subjects
• Consumers
• Researchers
• Research Organizations
• Research Sponsors
• Data Archives/Publisher
• Scholarly Publishers
• Service/Infrastructure Providers
17. Legal Requirements and Rights
[Diagram: overlapping areas of law and policy that bear on research data]
• Contract
– Contract
– Click-Wrap TOU
– License
• Intellectual Property
– Trade Secret
– Patent
– Intellectual Attribution
– Moral Rights
– Database Rights
– Copyright
– DMCA
– Trademark
– Fair Use
• Journal Replication Requirements
• Funder Open Access
• Rights of Publicity
• Privacy Torts (Invasion, Defamation)
• Access Rights
– FOIA
– State FOI Laws
• Confidentiality
– Common Rule (45 CFR 46)
– HIPAA
– FERPA
– CIPSEA
– EU Privacy Directive
– State Privacy Laws
• Potentially Harmful (Archeological Sites, Animal Testing, …)
• Classified
• Sensitive but Unclassified
• EAR
• ITAR
18. Stakeholders, Rights and Requirements
[Diagram: the legal requirements and rights of slide 17, overlaid with the stakeholders they affect]
• Scholarly Publishers
• Consumers
– Secondary research
– Participative Science
– Public policy uses
• Infrastructure/Service Providers
• Primary Researchers
• Research Organizations
• Data Archives
• Research Sponsors
• Sources/Subjects
19. Stakeholder Drivers per Stage of Information Lifecycle
Stage: Research Proposal, Design and Data Collection
• Subjects
– Legal constraints: Consent/contract
– Concerns: Public benefit; Privacy; Future access to own information
• Sources
– Legal constraints: Intellectual Property; Contract
– Concerns: Business confidentiality; IP; Profit from licenses
• Funder
– Legal constraints: Open Access; Confidentiality
– Concerns: Public benefit; Policy Relevance; Reproducible Research; Future access
• Primary Researcher
– Legal constraints: Confidentiality; Contract; IP
– Concerns: Publication potential; Compliance with institutional/funder requirements
• Research Institution
– Legal constraints: Confidentiality; Contract; IP
– Concerns: Compliance with funder requirements; License, IP, confidentiality compliance
20. Stakeholder Drivers per Stage of Information Lifecycle
Stage: Data Storage, Analysis (Pre-publication)
• Primary Researcher
– Legal constraints: Confidentiality; Contract; IP
– Concerns: Publication potential; Compliance with institutional/funder requirements
• Research Institution
– Legal constraints: Confidentiality; Contract; IP
– Concerns: License, IP, confidentiality compliance; Records management
• Service Providers
– Legal constraints: Contract; (Selected Cases) Confidentiality Requirements
– Concerns: Contract; Service business model; Service deployment
21. Stakeholder Drivers per Stage of Information Lifecycle
Stage: Publication
• Primary Researcher
– Legal constraints: Compliance for: Source/subjects; Sponsor; Host institution; Publisher
– Concerns: Scholarly attribution/credit; Promote use of research; Track use/impact of research
• Sponsor
– Concerns: Track research products; Track compliance; Track use/impact
• Research Institution
– Legal constraints: Sponsor compliance; Records management; Intellectual property
– Concerns: Track OA products
• Scholarly/Journal Publisher
– Legal constraints: IP; Contract
– Concerns: Impact/use; Profit/business model; Replicability
• Data Publisher
– Legal constraints: IP
– Concerns: Profit/business model; Replicability; Connection to publication
22. Stakeholder Drivers per Stage of Information Lifecycle
Stage: Re(use)
• Research Reader
– Legal constraints: Access Rights
– Concerns: Provenance
• Secondary Researcher
– Legal constraints: Access rights; Confidentiality; Contract
– Concerns: Replicability; Data reintegration/reanalysis; Linking publications and data; Provenance
• “Citizen/Community Scientist”
– Legal constraints: Access Rights
– Concerns: Data redissemination/reanalysis; Linking publications and data
• Public Policy
– Legal constraints: Access Rights
– Concerns: Provenance; Replicability; Linking publications and data
• Education/teaching
– Legal constraints: Access Rights
– Concerns: “Classroom” use; MOOC use
24. Some Formal “DMP” Requirements
• The Final NIH Statement on Sharing Research Data
– Published in the NIH Guide on February 26, 2003: “Starting with the October 1, 2003 receipt date, investigators submitting an NIH application seeking $500,000 or more in direct costs in any single year are expected to include a plan for data sharing or state why data sharing is not possible.”
– Sharing is expected no later than the time the main findings from the final data set are accepted for publication
• NSF: All proposals must (as of 1/1/2011) include a data management plan.
– Specific requirements vague, for the most part: “will be determined by the community of interest through the process of peer review and program management.”
• Wellcome Trust:
– “will review data management and sharing plans, and any costs involved in delivering them, as an integral part of the funding decision”
25. DMP Goals
• Orchestrate data for current use
• Control disclosure
• Compliance with contracts, regulations, law, and policy
• Maximize value of information assets
• Ensure short term and long term dissemination
26. DMP Elements
• Orchestrate data for current use
– Quality Assurance
– Storage, backup, replication, and versioning
– Data Formats
– Data Organization
– Budget
– Metadata and documentation
• Control disclosure
– Access and Sharing
– Intellectual Property Rights
– Legal Requirements
– Security
• Compliance with contracts, regulations, law, and policy
– Access and Sharing
– Adherence
– Responsibility
– Ethics and privacy
– Security
• Value of information assets
– Data description
– Data value
– Relation to collection
– Relation to evidence base
– Budget
• Ensure short term and long term dissemination
– Data description
– Institutional Archiving Commitments
– Audience
– Access and Sharing
– Data Formats
– Data Organization
– Metadata and documentation
– Budget
27. DMP Details
• Sharing
– Plans for depositing in an existing public database
– Access procedures
– Embargo periods
– Access charges
– Timeframe for access
– Technical access methods
– Restrictions on access
• Long term access (Preservation)
– Requirements for data destruction, if applicable
– Procedures for long term preservation
– Institution responsible for long-term costs of data preservation
– Succession plans for data should archiving entity go out of existence
• Formats
– Generation and dissemination formats and procedural justification
– Storage format and archival justification
– Format documentation
• Metadata and documentation
– Internal and External Identifiers and Citations
– Metadata to be provided
– Metadata standards used
– Planned documentation and supporting materials
– Quality assurance procedures for metadata and documentation
• Data Organization
– File organization
– Naming conventions
• Storage, backup, replication, and versioning
– Facilities
– Methods
– Procedures
– Frequency
– Replication
– Version management
– Recovery guarantees
• Security
– Procedural controls
– Technical Controls
– Confidentiality concerns
– Access control rules
– Restrictions on use
• Budget
– Cost of preparing data and documentation
– Cost of storage and backup
– Cost of permanent archiving and access
• Intellectual Property Rights
– Entities who hold property rights
– Types of IP rights in data
– Protections provided
– Dispute resolution process
• Legal Requirements
– Provider requirements and plans to meet them
– Institutional requirements and plans to meet them
• Responsibility
– Individual or project team role responsible for data management
– Qualifications, certifications, and licenses of responsible parties
• Ethics and privacy
– Informed consent
– Protection of privacy
– Data use agreements
– Other ethical issues
• Adherence
– When will adherence to data management plan be checked or demonstrated
– Who is responsible for managing data in the project
– Who is responsible for checking adherence to data management plan
– Auditing procedures and framework
• Value of information assets
– Project use value
– Institutional audience and uses
– Public audience and uses
– Relation to institutional collection
– Relation to disciplinary evidence base
– Cost of re-creating data
28. Approaching Requirement Overlap
• Sanity-check DMP details with lifecycle questions:
– Who wants it?
– What do they need it for?
– When will it be used?
• Be conscious of elements that serve multiple goals or lifecycle stages
– Metadata/documentation
– Identifiers
– Budgets
– Formats
– IP Rights and confidentiality restrictions
– Responsibilities/Adherence
• Use tracking tools and methods throughout lifecycle
This Way…
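One way to spot elements that serve multiple goals is to record the goal-to-element mapping from the DMP Elements slide as data and invert it. The sketch below is illustrative only: the goal and element names are transcribed from that slide, while the dictionary layout and the function name `elements_serving_multiple_goals` are assumptions introduced here.

```python
from collections import defaultdict

# Goal -> elements, transcribed from the "DMP Elements" slide.
# The dictionary layout itself is an illustrative assumption.
DMP_ELEMENTS = {
    "Orchestrate data for current use": [
        "Quality Assurance", "Storage, backup, replication, and versioning",
        "Data Formats", "Data Organization", "Budget",
        "Metadata and documentation",
    ],
    "Control disclosure": [
        "Access and Sharing", "Intellectual Property Rights",
        "Legal Requirements", "Security",
    ],
    "Compliance with contracts, regulations, law, and policy": [
        "Access and Sharing", "Adherence", "Responsibility",
        "Ethics and privacy", "Security",
    ],
    "Value of information assets": [
        "Data description", "Data value", "Relation to collection",
        "Relation to evidence base", "Budget",
    ],
    "Ensure short term and long term dissemination": [
        "Data description", "Institutional Archiving Commitments", "Audience",
        "Access and Sharing", "Data Formats", "Data Organization",
        "Metadata and documentation", "Budget",
    ],
}


def elements_serving_multiple_goals(elements_by_goal):
    """Return {element: [goals]} for elements listed under more than one goal."""
    goals_by_element = defaultdict(list)
    for goal, elements in elements_by_goal.items():
        for element in elements:
            goals_by_element[element].append(goal)
    return {e: g for e, g in goals_by_element.items() if len(g) > 1}


overlaps = elements_serving_multiple_goals(DMP_ELEMENTS)
for element, goals in sorted(overlaps.items()):
    print(f"{element}: serves {len(goals)} goals")
```

Elements that show up in the result are candidates for being planned once and reused across goals rather than described separately for each.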
30. What do we track?
What tools and methods provide technical leverage or incentives to management across lifecycle stages and among actors?
• Identification – identifiers, references, citations
• Provenance – relationship of delivered data to history of inputs and modifications and actors responsible for these; revision control; versioning
• Authenticity: assertions about the provenance of the records
• Respect des fonds: assertions about the original organization of the records
• Chain of custody: assertions about the ownership of the records
• Integrity: assertions about the management of the records; fixity of bits; fixity of semantics
• Auditing: verification of properties & policy compliance
Sources: Bulleted list of attributes adapted from Moore 2008
31. Tracking Across Information Lifecycle
[Diagram: the information lifecycle (Creation/Collection, Storage/Ingest, Processing, Internal Sharing, Analysis, External dissemination/publication, Re-use, Long-term access) annotated with tracking mechanisms]
• Identifiers
• Metadata for: Integrity, Provenance, Custody
• Citation
32. Data Citation: a Point of Leverage
• Services
– Identifiers to specific fixed versions of data are needed to establish unambiguous chains of provenance
– Identifiers that can be globally resolved to machine-understandable metadata and to the identified object are needed to build generalized access and analysis services
– Persistence of identifiers is needed to maintain long-term access
• Incentives
– Scholarly credit (intellectual attribution) is a large motivator for many researchers; citation creates an incentive for researchers to publish data
– Scholars also comply with enforceable journal policies; requiring data citation is a light-weight method to make data access policies auditable
– Impact/usage is a motivator for public research funders; data citation provides a foundation for measures of usage and impact
33. Emerging Practices for Data Citation
• Publishers
– OECD iLibrary
– Thomson Reuters Data Citation Index
• Data archives
– Dataverse Network
– Data-PASS
• Harmonization efforts
– DataCite
– NAS BRDI
– ICSU/CODATA
• Discipline specific
34. Identifier and Citation Use Cases
Attribution
• Provide scholarly attribution
• Provide legal attribution
• Identify contributors to data
Verification
• Associate work with version of evidence used
• Verify fixity of bits
• Verify fixity of information
• Verify “authenticity” of work
Discovery
• Locate data via identifier
• Locate data integral to article
• Locate works related to data – articles, derivatives, sources
Access
• Access to surrogate
• On-line access to object
• Machine understandability
• Long-term understandability
Persistence
• Does evidence persist as long as assertions based on it?
• Is durability of evidence transparent?
35. Emerging Principles for Data Citation
• Data citations should be first class objects for publication: they should appear with citations to other works and should be as easy to cite as other works
• Citations should persist and enable access to a fixed version of data at least as long as the citing work exists
• Citations should support unambiguous attribution of credit to all contributors, possibly through the citation ecosystem
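As a rough illustration of these principles, a data citation can be modelled as a small record that carries attribution, a persistent identifier, the fixed version used, and fixity information, and that renders like any other reference. The sketch below is not a citation standard: the field names, the rendering order, and every value in the example (including the identifier) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCitation:
    authors: tuple     # contributors who should receive credit
    year: int
    title: str
    identifier: str    # persistent, globally resolvable identifier
    version: str       # the fixed version of the data that was used
    fingerprint: str   # fixity information for that version

    def render(self) -> str:
        """Render the citation as one reference-list entry."""
        names = "; ".join(self.authors)
        return (f'{names}. {self.year}. "{self.title}". '
                f"Version {self.version}. {self.identifier}. "
                f"Fingerprint: {self.fingerprint}")


# All values below are hypothetical placeholders.
citation = DataCitation(
    authors=("Doe, Jane", "Roe, Richard"),
    year=2012,
    title="Example Survey Data",
    identifier="doi:10.9999/EXAMPLE",
    version="2",
    fingerprint="sha256:0000",
)
print(citation.render())
```

Keeping version and fingerprint inside the citation is what lets a reader tie a published claim to one specific, verifiable state of the data.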
36. Fixity
• Are files, bitstreams corrupted?
• Do semantics remain the same over time, across formats, software analysis systems?
Some semantic approaches…
Universal Numeric Fingerprint - Canonicalization Perceptual Signatures –
Characterization of Significant Properties
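The difference between the two fixity questions can be shown with a toy example. The sketch below hashes raw bytes for bit-level fixity, and hashes a canonical text form of the numeric values for a crude semantic check. The canonicalization here (fixed significant digits in exponent notation) is a simplified stand-in chosen for illustration; it is not the Universal Numeric Fingerprint algorithm, and the function names are assumptions.

```python
import hashlib


def bit_fingerprint(data: bytes) -> str:
    """Bit-level fixity: any change to the byte stream changes the digest."""
    return hashlib.sha256(data).hexdigest()


def semantic_fingerprint(values, digits=7) -> str:
    """Toy semantic fixity: hash a canonical rendering of the numeric values.

    Each value is written in exponent notation with `digits` significant
    digits, so formatting differences between files do not matter.
    """
    canonical = "\n".join(f"{float(v):.{digits - 1}e}" for v in values)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The same three numbers stored in two different file formats.
csv_bytes = b"1.0,2.50,3\n"
tsv_bytes = b"1\t2.5\t3.000\n"
csv_values = [float(x) for x in csv_bytes.decode().strip().split(",")]
tsv_values = [float(x) for x in tsv_bytes.decode().strip().split("\t")]

same_bits = bit_fingerprint(csv_bytes) == bit_fingerprint(tsv_bytes)
same_meaning = semantic_fingerprint(csv_values) == semantic_fingerprint(tsv_values)
print("identical bits:", same_bits)
print("identical values:", same_meaning)
```

A format migration changes the bit-level digest by design, so only a semantic fingerprint of this general kind can confirm that the content survived the migration.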
37. Audit [aw-dit]:
An independent evaluation of records and activities to assess a system of controls
Fixity mitigates risk only if used for auditing.
38. Example: Functions of Storage Auditing
• Detect corruption/deletion of content
• Verify compliance with storage/replication policies
• Prompt repair actions
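The detect and prompt functions can be sketched as a comparison between stored files and a manifest of expected digests. This is a minimal illustration under stated assumptions: the manifest format (relative path mapped to a SHA-256 digest), the function names, and the wording of the suggested action are all invented here, and policy verification is left out.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def audit_storage(root, manifest):
    """Compare stored files with a manifest of expected SHA-256 digests.

    Returns a list of (relative_path, problem, suggested_action) tuples.
    """
    findings = []
    for rel_path, expected in sorted(manifest.items()):
        path = Path(root) / rel_path
        if not path.is_file():
            findings.append((rel_path, "deleted", "restore from replica"))
        elif sha256_of(path) != expected:
            findings.append((rel_path, "corrupted", "restore from replica"))
    return findings


# Demonstration on a temporary directory with one intact, one corrupted,
# and one missing file.
with tempfile.TemporaryDirectory() as root:
    good, bad = Path(root, "a.dat"), Path(root, "b.dat")
    good.write_bytes(b"alpha")
    bad.write_bytes(b"beta")
    manifest = {
        "a.dat": sha256_of(good),
        "b.dat": sha256_of(bad),
        "c.dat": hashlib.sha256(b"gamma").hexdigest(),  # never stored
    }
    bad.write_bytes(b"bet4")  # simulate silent corruption
    report = audit_storage(root, manifest)

for item in report:
    print(item)
```

The manifest has to be kept somewhere the audited storage cannot silently alter, otherwise corruption of the manifest and the content together would go unnoticed.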
39. Audit Design Choices
• Audit regularity and coverage: on-demand (manually); on event; randomized sample; scheduled/comprehensive
• Audit procedure, algorithms, certifying authority
• Auditing scope: integrity of object; integrity of collection; integrity of network; policy compliance; public/transparent auditing
• Trust model
• Threat model
41. Many Tools, Few Solutions
“Poor carpenters blame their tools”
–Proverb
“If all you have is a hammer, everything looks like a nail”
– Another Proverb
“Ultimately, some people need holes – but no one needs a drill. ”
– Yet Another Proverb
• Many scientific tools are embedded in needs, perspectives, and practices of specific disciplines
• Identify common requirements
• Identify gaps across lifecycle stages and among actors
42. Core Requirements for Data Sharing Infrastructure
• Stakeholder incentives
– recognition; citation; payment; compliance; services
• Dissemination
– access to metadata; documentation; data
• Access control
– authentication; authorization; rights management
• Provenance
– chain of control; verification of metadata, bits, semantic content
• Persistence
– bits; semantic content; use
• Legal protection
– rights management; consent; record keeping; auditing
• Usability
– discovery; deposit; curation; administration; collaboration
• Business model
Sources: King 2007; ICSU 2004; NSB 2005
43. Mind the Gaps
Lifecycle stages compared: collection, storage, analysis, dissemination, reuse
• Scientific Workflow Software (e.g. Taverna)
– Strengths: Close integration across supported lifecycle; Perceived as useful service by researchers; High performance
– Other gaps: Discipline-centric; Doesn’t address most storage requirements (replication, access control)
• Storage Grid/VRE (e.g. iRODS)
– Strengths: Integration across supported lifecycle; Storage is perceived as useful service by researchers; High performance
– Other gaps: Loose integration of analysis, insufficient for reproducibility
• Institutional Repository (e.g. DSpace)
– Strengths: Low cost; Institutional commitment to long-term access
– Other gaps: Access and discovery mechanisms usually tailored to publications, not data
• Reproducible Publications Systems (e.g. StatWeave)
– Strengths: Close integration of analysis and scientific publication; Reduces risk of embarrassment when working with “co-authors”; Ensures one form of reproducibility (calibration, mechanical replicability)
– Other gaps: Addresses replication but not reuse for secondary analysis, integration
• “Data Archive”
– Strengths: Richer support for reuse; Often supports cross-discipline discovery; long-term access
– Other gaps: Varied models – curated database; “virtual archive”, disciplinary repository; Often discipline-centric
45. Audit Data Replication & Integrity Policies
Automatic Auditing of Data Replication & Integrity Policies
safearchive.org
46. The Distributed Content Replication Problem
• We hold digital assets we wish to preserve
• Many of these assets are not replicated
• Even when replicated, vulnerable to single points of failure because replicas are managed by a single institution
A Partial Solution: LOCKSS
• Self-contained OSS
• Harvests resources via open interfaces
• Replicated through secure P2P protocol
• Self-repairing
• Zero trust
• Used by hundreds of institutions for collaborative preservation
What we needed…
• Auditing – how many replicas exist, where, and are they current?
• Policy – prove replication is consistent with policy, like TRAC?
• Collaboration – coordinate with partners to replicate content?
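The auditing and policy questions above can be phrased as a check of observed replicas against a declared policy. The sketch below is a hypothetical illustration, not SafeArchive's or LOCKSS's actual data model: the `Replica` fields, the policy parameters, and the threshold values are all assumptions, and none of them are taken from TRAC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    institution: str        # who holds this copy
    digest: str             # fixity digest reported for the copy
    days_since_check: int   # how long ago the copy was last verified


def audit_replication(replicas, expected_digest, min_copies,
                      min_institutions, max_days_since_check):
    """Report how many current copies exist, where, and any policy violations."""
    current = [r for r in replicas
               if r.digest == expected_digest
               and r.days_since_check <= max_days_since_check]
    institutions = sorted({r.institution for r in current})
    problems = []
    if len(current) < min_copies:
        problems.append(
            f"only {len(current)} current copies, policy requires {min_copies}")
    if len(institutions) < min_institutions:
        problems.append(
            f"copies held by {len(institutions)} institutions, "
            f"policy requires {min_institutions}")
    return {"current_copies": len(current),
            "institutions": institutions,
            "problems": problems}


replicas = [
    Replica("Institution A", "abc123", 2),
    Replica("Institution A", "abc123", 5),
    Replica("Institution B", "abc123", 90),  # stale: not verified recently
    Replica("Institution C", "fff000", 1),   # digest does not match
]
result = audit_replication(replicas, "abc123", min_copies=3,
                           min_institutions=2, max_days_since_check=30)
print(result)
```

Counting institutions separately from copies addresses the single-point-of-failure concern: several copies at one institution do not satisfy a policy that asks for independent custodians.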
47. Resilience of peer-to-peer with the Accountability of centralized system
Facilitating collaborative replication and preservation with cyberinfrastructure …
• Collaborators declare explicit non-uniform resource commitments
• Policy records and schematizes commitments, desired TRAC replication properties
• Storage layer provides replication, integrity, freshness, versioning
• SafeArchive software provides monitoring, auditing, transparency, and provisioning
• Content is harvested through HTTP (LOCKSS) or OAI-PMH
• Integration of LOCKSS, Institutional Repositories, TRAC
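For the OAI-PMH harvesting path, a harvester issues HTTP requests whose query string names a protocol verb. The sketch below only builds a `ListRecords` request URL using standard OAI-PMH parameters (`verb`, `metadataPrefix`, and the optional `set` and `from`); the repository base URL is a made-up placeholder, the function name is an assumption, and nothing here reflects how SafeArchive itself is implemented.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    `oai_dc` (unqualified Dublin Core) is the metadata format every
    OAI-PMH repository is required to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # restrict the harvest to one set
    if from_date:
        params["from"] = from_date    # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only for illustration.
url = list_records_url("https://repository.example.org/oai",
                       from_date="2012-01-01")
print(url)
```

Using the `from` parameter lets repeated harvests fetch only records changed since the previous run, which matters when harvests feed a scheduled audit.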
48. ORCID is an international, interdisciplinary, open, and not-for-profit organization created for the benefit of all stakeholders, including research institutions, funding organizations, publishers, and researchers to enhance the scientific discovery process and improve collaboration and the efficiency of research funding.
ORCID aims to solve the name ambiguity problem in scholarly communications by creating a registry of persistent unique identifiers for individual researchers and an open and transparent linking mechanism between ORCID, other ID schemes, and research objects such as publications, grants, and patents.
http://orcid.org
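A detail not covered on the slide: an ORCID identifier is written as 16 characters in four hyphen-separated groups, and the last character is a check character computed with the ISO 7064 MOD 11-2 algorithm, which allows a mistyped identifier to be caught before any registry lookup. The sketch below validates only that check character; the function names are assumptions, and the two sample identifiers are the example values commonly shown in ORCID's own documentation.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, digits, and check character of an ORCID identifier."""
    compact = orcid.replace("-", "")
    if len(compact) != 16:
        return False
    base, check = compact[:15], compact[15].upper()
    if not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == check


for candidate in ("0000-0002-1825-0097", "0000-0002-1694-233X",
                  "0000-0002-1825-0098"):
    print(candidate, is_valid_orcid(candidate))
```

A passing check only shows the identifier is well formed; whether it belongs to the intended researcher still has to be confirmed against the registry.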
49. ORCID Launch to Public in October
The ORCID Launch Partners Program includes research institutions, publishers, research funders, data repositories, and third party providers, such as: The American Physical Society, Aries Systems, Avedas, Boston University, the California Institute of Technology, CrossRef, Elsevier, Faculty of 1000, figshare, Hindawi Publishing Corporation, KNODE, Nature Publishing Group, SafetyLit, Symplectic, Thomson Reuters, Total-Impact, and Wellcome Trust.
At Launch, the ORCID Registry will:
• Allow researchers and scholars to register for an ORCID identifier, create ORCID records, and manage their privacy settings
• Contain ORCID records created by universities on behalf of their researchers and scholars
• Allow researchers and scholars to link their ORCID record to external identifiers, including Scopus and ResearcherID
• Facilitate synchronization of ORCID identifier record data with external systems including Scopus
• Bi-directionally link to a number of author profile and manuscript submission systems, including the American Physical Society, Aries Systems, Hindawi Publishing Corporation, Nature Publishing Group, and Scholar One Manuscripts
• Allow researchers and scholars to search and upload publication metadata from CrossRef
• (Soon after launch) have the ability to link to grant application systems
50. Data Management Workflows for Open Access Journals
http://bit.ly/DVNOJS
51. Embed Real Data Archives in Journals
• Embed remotely managed data archive in OJS journal
• Replaces “supplemental materials”
• Adds
– Online analysis
– Independent storage
– Persistent identifiers and citation
– Data versioning
– Enhanced discoverability and interoperability
– Format normalization
– Fixity and replication
52. Integrated Policies, Workflow, Access
• OJS and DVN
– Support workflows
– Enforce policies
– Disseminate content
• Integrate policies for
– Access and data license
– Embargoes
– Citation
• Coordinate
– Submission
– Review
– Publication
• Link
– Content
– Subscriptions & notifications
– Usage Metrics
54. How will we see the geography of science when we reveal how research connects through data?
Research & Node Layout: Kevin Boyack and Dick Klavans (mapofscience.com); Data: Thomson ISI; Graphics & Typography: W. Bradford Paley (didi.com/brad); Commissioned by: Katy Börner (scimaps.org)
Seed Magazine, Mar 7, 2007
http://seedmagazine.com/content/article/scientific_method_relationships_among_scientific_paradigms/
55. Summary
• Principled approach to data management
– Follow information through information lifecycle
– Assess stakeholder requirements
– Track management, use, impact across lifecycle
• Data management planning goals
– Orchestrate data for current use
– Protect against disclosure
– Compliance with contracts, regulations, law, and policy
– Maximize value of information assets
– Ensure short term and long term dissemination
• Lifecycle data management tracking
– Identification – identifiers, references, citations
– Provenance – relationship of delivered data to history of inputs and modifications and actors responsible for these
– Authenticity: assertions about the provenance of the records
– Chain of custody: assertions about the ownership of the records
– Integrity: assertions about the management of the records; fixity of bits; fixity of semantics
– Auditing: verification of properties & policy compliance
• Data citation is a key leverage point
– Services: establish provenance; access; long-term preservation
– Incentives: scholarly credit; reproducible research policies; impact/usage analysis
– Data citations should be first class objects for publication -- appear with citations to other works; should be as easy to cite as other works
56. Additional References
• Buckheit J, Donoho DL. Wavelab and reproducible research. In: Antoniadis A, editor. Wavelets and Statistics. New York, NY: Springer; 1995. p. 55-81.
• International Council For Science (ICSU). 2004. ICSU Report of the CSPR Assessment Panel on Scientific Data and Information. Report.
• King, Gary. 2007. "An Introduction to the Dataverse Network as an Infrastructure for Data Sharing." Sociological Methods and Research 36.
• Moore, R. 2008. Towards a Theory of Digital Preservation. International Journal of Digital Curation 3(1).
• National Science Board (NSB). 2005. Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century. NSF. (NSB-05-40).
This work by Micah Altman (http://micahaltman.com) is licensed under the Creative Commons Attribution-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA.
Most of the different stakeholders have stronger relationships/stakes with research at different stages. But researchers and research institutions are in the middle: they have a strong stake in most stages. Researchers are more directly concerned with collection, processing, analysis, dissemination. Organizations have a higher stake in internal sharing, re-use, long-term access.
This section is a more detailed deep dive into drivers at major stages of the information lifecycle. It is not intended to be part of the main presentation, but could be used to respond to questions, or to focus on a particular stage.