Data curation issues
Michelle Hudson
SCOPA Forum, 9.21.11

What is data?
  • The definition varies by discipline and can include experimental, observational, and computational data.
  • In general, “research data” refers to the raw or processed products of a research project.
  • These products can be video, images, or numeric files in the form of geographic information, spreadsheets, and other formats.

What is data curation?
  • “Data curation is the active and ongoing management of research data through its lifecycle of interest and usefulness to scholarship, science, and education.” – Carole Palmer, UIUC GSLIS
  • “Curation” includes selection, appraisal, maintenance, and preservation.

Why is data curation important for us?
  • According to Paul F. Uhlir, Director of the Board on Research Data and Information, researchers are “contributing to a networked information enterprise where data are a fundamental infrastructural component of the modern research system.”
  • Increasingly, data itself is a product and record of scholarship.

Some problems that make curation difficult
  • No standards.
  • Lack of interoperability.
  • Controlled vocabularies are missing.
  • Storage space is limited.
  • Domain of stewardship/responsibility is unclear.
  • Individual repositories make silos of content.

Ideas for solutions!
  • Experiment tracking software and electronic notebooks.
  • Automatic metadata.
  • Integrating curation early into the researcher workflow.
  • Educating graduate students on proper data management.
  • DataONE and the Data Conservancy.

Other stuff!
  • Data citation
  • Data sharing
  • Reward models
  • Identity control (ORCID, EZID)
  • Semantic web and linked data
  • Cyberinfrastructure

Questions?
  • michelle.hudson@yale.edu
  • 203.432.4587
  • @michellehudson
  • In person for coffee @ kbt cafe

Más contenido relacionado

La actualidad más candente

Research Data Management from a Software Engineering Perspective
Research Data Management from a Software Engineering PerspectiveResearch Data Management from a Software Engineering Perspective
Research Data Management from a Software Engineering PerspectiveSarah Anna Stewart
 
Introduction to research data management; Lecture 01 for GRAD521
Introduction to research data management; Lecture 01 for GRAD521Introduction to research data management; Lecture 01 for GRAD521
Introduction to research data management; Lecture 01 for GRAD521Amanda Whitmire
 
RDAP 16 Poster: Challenges and Opportunities in an Institutional Repository S...
RDAP 16 Poster: Challenges and Opportunities in an Institutional Repository S...RDAP 16 Poster: Challenges and Opportunities in an Institutional Repository S...
RDAP 16 Poster: Challenges and Opportunities in an Institutional Repository S...ASIS&T
 
Virtual Research Environments
Virtual Research EnvironmentsVirtual Research Environments
Virtual Research EnvironmentsJoss Winn
 
SEEKing our way to better presentation of data and models from scientific inv...
SEEKing our way to better presentation of data and models from scientific inv...SEEKing our way to better presentation of data and models from scientific inv...
SEEKing our way to better presentation of data and models from scientific inv...Natalie Stanford
 
Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014Jisc
 
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Jian Qin
 
Presentation to the UM Library Emergent Research Series
Presentation to the UM Library Emergent Research SeriesPresentation to the UM Library Emergent Research Series
Presentation to the UM Library Emergent Research SeriesSEAD
 
Data Science and What It Means to Library and Information Science
Data Science and What It Means to Library and Information ScienceData Science and What It Means to Library and Information Science
Data Science and What It Means to Library and Information ScienceJian Qin
 
Disciplinary RDM
Disciplinary RDMDisciplinary RDM
Disciplinary RDMSarah Jones
 
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...Amanda Whitmire
 
Managing data responsibly to enable research interity
Managing data responsibly to enable research interityManaging data responsibly to enable research interity
Managing data responsibly to enable research interityIUPUI
 
Data management (1)
Data management (1)Data management (1)
Data management (1)SM Lalon
 
Data publishing at the UQ Library
Data publishing at the UQ LibraryData publishing at the UQ Library
Data publishing at the UQ LibraryARDC
 

La actualidad más candente (20)

Kno.e.sis Review: late 2012 to mid 2013
Kno.e.sis Review: late 2012 to mid 2013Kno.e.sis Review: late 2012 to mid 2013
Kno.e.sis Review: late 2012 to mid 2013
 
Research Data Management from a Software Engineering Perspective
Research Data Management from a Software Engineering PerspectiveResearch Data Management from a Software Engineering Perspective
Research Data Management from a Software Engineering Perspective
 
Knoesis Student Achievement
Knoesis Student AchievementKnoesis Student Achievement
Knoesis Student Achievement
 
Introduction to research data management; Lecture 01 for GRAD521
Introduction to research data management; Lecture 01 for GRAD521Introduction to research data management; Lecture 01 for GRAD521
Introduction to research data management; Lecture 01 for GRAD521
 
RDAP 16 Poster: Challenges and Opportunities in an Institutional Repository S...
RDAP 16 Poster: Challenges and Opportunities in an Institutional Repository S...RDAP 16 Poster: Challenges and Opportunities in an Institutional Repository S...
RDAP 16 Poster: Challenges and Opportunities in an Institutional Repository S...
 
Virtual Research Environments
Virtual Research EnvironmentsVirtual Research Environments
Virtual Research Environments
 
Introduction to Research Data Management - 2015-05-27 - Social Sciences Divis...
Introduction to Research Data Management - 2015-05-27 - Social Sciences Divis...Introduction to Research Data Management - 2015-05-27 - Social Sciences Divis...
Introduction to Research Data Management - 2015-05-27 - Social Sciences Divis...
 
SEEKing our way to better presentation of data and models from scientific inv...
SEEKing our way to better presentation of data and models from scientific inv...SEEKing our way to better presentation of data and models from scientific inv...
SEEKing our way to better presentation of data and models from scientific inv...
 
Sgci data west 12-15-16
Sgci data west 12-15-16Sgci data west 12-15-16
Sgci data west 12-15-16
 
Summary of 3DPAS
Summary of 3DPASSummary of 3DPAS
Summary of 3DPAS
 
Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014Why science needs open data – Jisc and CNI conference 10 July 2014
Why science needs open data – Jisc and CNI conference 10 July 2014
 
Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...Functional and Architectural Requirements for Metadata: Supporting Discovery...
Functional and Architectural Requirements for Metadata: Supporting Discovery...
 
Presentation to the UM Library Emergent Research Series
Presentation to the UM Library Emergent Research SeriesPresentation to the UM Library Emergent Research Series
Presentation to the UM Library Emergent Research Series
 
Data Science and What It Means to Library and Information Science
Data Science and What It Means to Library and Information ScienceData Science and What It Means to Library and Information Science
Data Science and What It Means to Library and Information Science
 
Disciplinary RDM
Disciplinary RDMDisciplinary RDM
Disciplinary RDM
 
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
 
Sgci data west 12-15-16
Sgci data west 12-15-16Sgci data west 12-15-16
Sgci data west 12-15-16
 
Managing data responsibly to enable research interity
Managing data responsibly to enable research interityManaging data responsibly to enable research interity
Managing data responsibly to enable research interity
 
Data management (1)
Data management (1)Data management (1)
Data management (1)
 
Data publishing at the UQ Library
Data publishing at the UQ LibraryData publishing at the UQ Library
Data publishing at the UQ Library
 

Destacado

Data curation issues for repositories
Data curation issues for repositoriesData curation issues for repositories
Data curation issues for repositoriesChris Rusbridge
 
Current and emerging scientific data curation practices
Current and emerging scientific data curation practicesCurrent and emerging scientific data curation practices
Current and emerging scientific data curation practicesMichael Day
 
Curation Service Models - Michael Witt - RDAP12
Curation Service Models - Michael Witt - RDAP12Curation Service Models - Michael Witt - RDAP12
Curation Service Models - Michael Witt - RDAP12ASIS&T
 
Data curation and preservation: the Digital Curation Centre
Data curation and preservation: the Digital Curation CentreData curation and preservation: the Digital Curation Centre
Data curation and preservation: the Digital Curation CentreMichael Day
 
Data Curation Models JHU Barbara Pralle RDAP12
Data Curation Models JHU Barbara Pralle RDAP12Data Curation Models JHU Barbara Pralle RDAP12
Data Curation Models JHU Barbara Pralle RDAP12ASIS&T
 
Data Curation: Retooling the Existing Workforce
Data Curation: Retooling the Existing WorkforceData Curation: Retooling the Existing Workforce
Data Curation: Retooling the Existing WorkforceSteven Miller
 

Destacado (6)

Data curation issues for repositories
Data curation issues for repositoriesData curation issues for repositories
Data curation issues for repositories
 
Current and emerging scientific data curation practices
Current and emerging scientific data curation practicesCurrent and emerging scientific data curation practices
Current and emerging scientific data curation practices
 
Curation Service Models - Michael Witt - RDAP12
Curation Service Models - Michael Witt - RDAP12Curation Service Models - Michael Witt - RDAP12
Curation Service Models - Michael Witt - RDAP12
 
Data curation and preservation: the Digital Curation Centre
Data curation and preservation: the Digital Curation CentreData curation and preservation: the Digital Curation Centre
Data curation and preservation: the Digital Curation Centre
 
Data Curation Models JHU Barbara Pralle RDAP12
Data Curation Models JHU Barbara Pralle RDAP12Data Curation Models JHU Barbara Pralle RDAP12
Data Curation Models JHU Barbara Pralle RDAP12
 
Data Curation: Retooling the Existing Workforce
Data Curation: Retooling the Existing WorkforceData Curation: Retooling the Existing Workforce
Data Curation: Retooling the Existing Workforce
 

Similar a data curation issues

Research process and research data management
Research  process and research data managementResearch  process and research data management
Research process and research data managementKen Chad Consulting Ltd
 
UKSG 2014 Breakout Session - Westminster Research Process and Research Data
UKSG 2014 Breakout Session - Westminster Research Process and Research DataUKSG 2014 Breakout Session - Westminster Research Process and Research Data
UKSG 2014 Breakout Session - Westminster Research Process and Research DataUKSG: connecting the knowledge community
 
Fsci 2018 monday30_july_am6
Fsci 2018 monday30_july_am6Fsci 2018 monday30_july_am6
Fsci 2018 monday30_july_am6ARDC
 
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017ARDC
 
Coming to an Understanding: a Cross-institutional Examination of Assessments ...
Coming to an Understanding: a Cross-institutional Examination of Assessments ...Coming to an Understanding: a Cross-institutional Examination of Assessments ...
Coming to an Understanding: a Cross-institutional Examination of Assessments ...Stephanie Wright
 
Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction) Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction) Jamie Bisset
 
Research Data Management and Librarians
Research Data Management and LibrariansResearch Data Management and Librarians
Research Data Management and LibrariansJohann van Wyk
 
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...University of California Curation Center
 
Open Data in Slovenia: An assessment of Accountability among Stakeholders, 2012
Open Data in Slovenia: An assessment of Accountability among Stakeholders, 2012Open Data in Slovenia: An assessment of Accountability among Stakeholders, 2012
Open Data in Slovenia: An assessment of Accountability among Stakeholders, 2012Arhiv družboslovnih podatkov
 
Virtual Organizations 2.0: Social Constructs for Data-centered Collaborative ...
Virtual Organizations 2.0: Social Constructs for Data-centered Collaborative ...Virtual Organizations 2.0: Social Constructs for Data-centered Collaborative ...
Virtual Organizations 2.0: Social Constructs for Data-centered Collaborative ...Globus
 
Scientific Information Management at the U.S. Geological Survey
Scientific Information Management at the U.S. Geological SurveyScientific Information Management at the U.S. Geological Survey
Scientific Information Management at the U.S. Geological SurveyDave Govoni
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data ManagementJamie Bisset
 
Neville Prendergast "E-Science - What is it?"
Neville Prendergast "E-Science - What is it?"Neville Prendergast "E-Science - What is it?"
Neville Prendergast "E-Science - What is it?"The TMC Library
 
Curation of Research Data
Curation of Research DataCuration of Research Data
Curation of Research DataMichael Day
 
Mind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeMind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeLizLyon
 
Research data management for masters and ph d students
Research data management for masters and ph d studentsResearch data management for masters and ph d students
Research data management for masters and ph d studentsDebs Martindale
 
Acting as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeActing as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeLizLyon
 

Similar a data curation issues (20)

Research process and research data management
Research  process and research data managementResearch  process and research data management
Research process and research data management
 
UKSG 2014 Breakout Session - Westminster Research Process and Research Data
UKSG 2014 Breakout Session - Westminster Research Process and Research DataUKSG 2014 Breakout Session - Westminster Research Process and Research Data
UKSG 2014 Breakout Session - Westminster Research Process and Research Data
 
Research data life cycle
Research data life cycleResearch data life cycle
Research data life cycle
 
Fsci 2018 monday30_july_am6
Fsci 2018 monday30_july_am6Fsci 2018 monday30_july_am6
Fsci 2018 monday30_july_am6
 
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
Research Data Management in practice, RIA Data Management Workshop Brisbane 2017
 
Coming to an Understanding: a Cross-institutional Examination of Assessments ...
Coming to an Understanding: a Cross-institutional Examination of Assessments ...Coming to an Understanding: a Cross-institutional Examination of Assessments ...
Coming to an Understanding: a Cross-institutional Examination of Assessments ...
 
Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction) Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction)
 
Martone grethe
Martone gretheMartone grethe
Martone grethe
 
METRO RDM Webinar
METRO RDM WebinarMETRO RDM Webinar
METRO RDM Webinar
 
Research Data Management and Librarians
Research Data Management and LibrariansResearch Data Management and Librarians
Research Data Management and Librarians
 
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
DMPTool Webinar 8: Data Curation Profiles and the DMPTool (presented by Jake ...
 
Open Data in Slovenia: An assessment of Accountability among Stakeholders, 2012
Open Data in Slovenia: An assessment of Accountability among Stakeholders, 2012Open Data in Slovenia: An assessment of Accountability among Stakeholders, 2012
Open Data in Slovenia: An assessment of Accountability among Stakeholders, 2012
 
Virtual Organizations 2.0: Social Constructs for Data-centered Collaborative ...
Virtual Organizations 2.0: Social Constructs for Data-centered Collaborative ...Virtual Organizations 2.0: Social Constructs for Data-centered Collaborative ...
Virtual Organizations 2.0: Social Constructs for Data-centered Collaborative ...
 
Scientific Information Management at the U.S. Geological Survey
Scientific Information Management at the U.S. Geological SurveyScientific Information Management at the U.S. Geological Survey
Scientific Information Management at the U.S. Geological Survey
 
Research Data Management
Research Data ManagementResearch Data Management
Research Data Management
 
Neville Prendergast "E-Science - What is it?"
Neville Prendergast "E-Science - What is it?"Neville Prendergast "E-Science - What is it?"
Neville Prendergast "E-Science - What is it?"
 
Curation of Research Data
Curation of Research DataCuration of Research Data
Curation of Research Data
 
Mind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and PracticeMind the Gap: Reflections on Data Policies and Practice
Mind the Gap: Reflections on Data Policies and Practice
 
Research data management for masters and ph d students
Research data management for masters and ph d studentsResearch data management for masters and ph d students
Research data management for masters and ph d students
 
Acting as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decadeActing as Advocate? Seven steps for libraries in the data decade
Acting as Advocate? Seven steps for libraries in the data decade
 

Último

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Último (20)

GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation

data curation issues

  • 1. Data curation issues Michelle Hudson SCOPA Forum 9.21.11
  • 5. What is data? Definition varies by discipline and can include experimental, observational, and computational data. In general “research data” refers to raw or processed products of a research project. These products can be video, images, or numeric files in the form of geographic information, spreadsheets, and other formats.
  • 8. What is data curation? “Data curation is the active and ongoing management of research data through its lifecycle of interest and usefulness to scholarship, science, and education.” – Carole Palmer, UIUC GSLIS “Curation” includes selection, appraisal, maintenance, preservation.
  • 11. Why is data curation important for us? According to Paul F. Uhlir, Director of the Board on Research Data and Information, researchers are “contributing to a networked information enterprise where data are a fundamental infrastructural component of the modern research system.” Increasingly, data itself is a product and record of scholarship.
  • 18. Some Problems that make curation difficult. No standards. Lack of interoperability. Controlled vocabularies are missing. Storage space is limited. Domain of stewardship/responsibility is unclear. Individual repositories make silos of content.
  • 24. Ideas for solutions! Experiment tracking software and electronic notebooks. Automatic metadata. Integrating curation early into the researcher workflow. Educating graduate students on proper data management. DataONE and the Data Conservancy.
  • 25. Other stuff! Data citation; data sharing; reward models; identity control (ORCID, EZID); semantic web and linked data; cyberinfrastructure.
  • 26. Questions? michelle.hudson@yale.edu, 203.432.4587, @michellehudson, or in person for coffee @ kbt cafe

Editor's notes

  1. Hi! I’m Michelle Hudson. I’m the science and social science librarian working in the social science library for now, but I’ll be in the new Center for Science and Social Science Information when we open in January up in the Kline Biology Tower. I started at Yale in April and I’ve been going to conferences and workshops all summer. Today I’m going to talk about things I learned at the Summer Institute on Data Curation put on by UIUC’s GSLIS and the NSF-funded Princeton Research Data Lifecycle Workshop.
  2. Director of CIRSS -- Center for Informatics Research in Science & Scholarship
  3. As well as many other things – too many to list.
  4. No standards for storage or metadata either within or across disciplines in many cases.
  5. Poor interoperability makes it difficult to maintain or convert data to appropriate formats – open source software and standards are lacking.
  6. Even simple ones to standardize the outcomes of experiments – people don’t talk about things the same way.
  7. Some data are cheaper to generate than to store. That’s not to say they’re cheap to generate – it’s just that adequate backed-up storage is still unreasonably expensive for most research enterprises.
  8. Domain of stewardship/responsibility: whose job is it to keep university assets for thirty, forty years or more? Who can give this authority and who is willing to fill the gap? As a discipline, we don’t have liaison models and services to researchers in place. Is it the library? Is it IT?
  9. So Cornell has a repository with a lot of great data they’re willing to share. This isn’t helpful unless we know data sets are stored there. It’s even less helpful if we can’t access them. This is true for all kinds of universities and centers all over the world.
  10. Utilizing software that tracks and records the progress of experiments and keeps data as it happens is something that’s gaining more popularity. Every scientist uses a notebook, and, until recently, these notebooks were paper and shelved somewhere inaccessible or forgotten once the project was done. With electronic notebooks, it’ll be easier to keep track of what’s done to the data, when, etc.
  11. Automatic metadata application came up as a main point during the Princeton workshop. Scientists want to advocate for instrument makers to give more robust options for creating metadata at the point of data creation/observation.
  12. If we start early in the workflow, it’s easier to get usable data we can decide to select and keep with less work down the line. If data is taken care of early with the aim of keeping it safe for a long time, it would require less work on the part of the curators.
  13. Some institutions are beginning to educate grad students on good data management techniques. Simple data or file management isn’t often taught as part of a research methods scope, so students have been welcoming this information, and see librarians as experts on this kind of thing.
  14. DataONE and the Data Conservancy are two NSF-funded projects with important goals for nation-wide cyberinfrastructure and research data storage. DataONE intends to connect “nodes” of research institutions to make the data they share widely available. They also have a mission to educate students, librarians, and researchers on data management best practices. The Data Conservancy is a model for a large-scale repository for research data. Instances will start rolling out as early as next year and it’ll be interesting to see how it develops and if it’s viable. You can read a lot more about it on their website, which is on your handout.
  15. Feel free to talk to me about anything else related to data as well. Obviously not much time to cover it all today, but let’s meet and chat! Also refer to the handout for good resources on learning more.