
Research Data Infrastructure for Geochemistry (DFG Roundtable)

This presentation provides an overview of different aspects of data management for geochemistry and resources available at the EarthChem@IEDA data facility.

Published in: Science
Research Data Infrastructure for Geochemistry (DFG Roundtable)

  1. Research Data Infrastructure for Geochemistry iedadata.org
  2. Investment. IEDA 2016-2021: Operation of a Multi-Disciplinary Data Facility for the Earth Science Community • Invited renewal proposal after IEDA review in 2014/15 • Next 5 years of operating IEDA • $14.4 million
  3. IEDA Data Systems for Geochemistry
  4. IEDA / EarthChem • Community driven • Community governance • Community engagement & training • Standards compliant (accredited ‘trustworthiness’) • Follow data curation standards • QA/QC procedures • Unique, persistent identification of data • Persistent access to data holdings • Operational procedures (risk management, IP, etc.) • Demonstrated impact on science
  5. Scientific Justification • Enable new data-intensive science, new cross-disciplinary studies, and new kinds of collaborations. • Expand opportunities for scientists, educators, and the public to participate in science. • Maximize the return on national research investments. • Ensure reproducible science: permit verification of research results. • Contribute to new science initiatives. “Data collections provide more than an increase in the efficiency and accuracy of research: they enable new research opportunities.” (“Long-lived Digital Data Collections: Enabling Research and Education in the 21st Century”, NSB Report, September 2005)
  6. Science from EarthChem Data Systems
  7. Gale et al.
  8. Gale et al.
  9. Data Policies December 11, 2013 • Agencies • Societies • Journals May 9, 2013 February 22, 2013
  10. Data Policies December 11, 2013
  11. Concern: Reproducibility “The field sciences (e.g., geology, ecology, and archaeology), where each study is temporally (and often spatially) unique, provide exemplars for the importance of preserving data and samples for further analysis.”
  12. Data Policies: December 11, 2013
  13. COPDESS: Coalition for Publishing Data in the Earth & Space Sciences “Connecting Earth Science publishers and Data Facilities to help translate the aspirations of open, available, and useful data from policy into practice.”
  14. Data: Publishers’ Perspective • Many have had supplements for some time. • Difficult to deal with, costly • Mostly PDFs (not searchable, poorly indexed, variable quality) • Require authors to comply with data availability policy; policing • Little guidance on community standards • Want to use and promote repositories, but these are not well integrated, with a few exceptions • Worried about repository funding and stability. Slide courtesy of Brooks Hanson, AGU Director for Publications
  15. Statement of Commitment COPDESS.org • Reaffirm and ensure adherence to our existing journal and publishing policies…regarding data sharing and archiving... • Signed by ~50 publishers & data facilities • “Earth and space science data should, to the greatest extent possible, be stored in appropriate domain repositories that ... follow leading practices, and can provide additional data services.” • Released 15 January. Article in Eos.org • https://eos.org/agu-news/committing-publishing-data-earth-space-sciences
  16. https://copdessdirectory.osf.io/ (to be integrated with re3data.org)
  17. Domain-specific Data Facilities • Science Community • Domain-specific data facility • Discipline-specific data services (context & provenance metadata, semantics, workflows) • Data curation services • CI development
  18. Findable: identification, persistence • Accessible: protection, protocols • Re-usable: context, provenance • Interoperable: harmonized, machine-readable • Data Curation Standards • Domain-specific Data Standards (ESIP Winter 2016: "Unleashing the BIG in Small Data", 1/6/16)
  19. Findable: identification, persistence • Accessible: protection, protocols • Re-usable: context, provenance • Interoperable: harmonized, machine-readable
  20. Unleashing the BIG in small Research Data. Kerstin Lehnert, Lamont-Doherty Earth Observatory of Columbia University, Palisades, NY, 10964. http://bigdata-madesimple.com/hey-big-data-dont-forget-your-little-data-cousin/
  21. Small Data: Pieces of a Puzzle …
  22. … that build a picture
  23. Small Data, Big Science: Example 1. Science question: Did New Zealand Dust Influence the Last Ice Age? “Understanding where the dust that's in the atmosphere and oceans comes from can help scientists estimate its impact on earth's climate system.” Bess Koffman, Michael Kaplan, Steven Goldstein, Gisela Winckler (LDEO), Natalie Mahowald (Cornell) http://blogs.ei.columbia.edu/2014/03/13/did-new-zealand-dust-influence-the-last-ice-age/
  24. Small Data - Big Effort, or: What it takes to generate a few kilobytes of data
  25. Small Data, Big Science: Example 2. Science question: Do convergent margin volcanoes really represent continental crust? “As it is crucial to understand the extent and origin of the compositional difference between central Aleutian lavas and plutons through time and space, this project will map and sample plutonic rocks exposed on the central Aleutians and their coeval volcanic host rocks.” http://www.nsf.gov/discoveries/disc_summ.jsp?cntn_id=135851&org=NSF
  26. Small Data - Big Effort, or: What it takes to generate a few kilobytes of data • 4 scientists (3 institutions) traveling to Alaska • 5 weeks on remote islands • a boat (with crew) • a helicopter. Anticipated data: ~250 samples • ~200 major element analyses • ~150 trace element analyses • 50 U/Pb zircon geochronology • 30 Ar-Ar ages • 80 Sr, Nd, Hf and Pb isotope analyses
  27. (no text on slide)
  28. EarthChem Data Systems (diagram labels): Investigators • EarthChem Library • PetDB, SedDB • EarthChem Portal • Data Publication & Preservation • Data Mining & Analysis • Metadata Catalog • Data & Metadata • External Systems • EarthChem Data Managers
  29. EarthChem Library • Data types: analytical datasets, experimental datasets, macros/tools, data compilations (syntheses), images, data reports
  30. DOI to allow proper citation of data • Link to publications • Link to funding source
  31. Accessible in the EarthChem Library
  32. Editors Roundtable Recommendations • Data need to be available in useful format • Complete disclosure of data • Data in tabular (usable!) format, no .PDF or .jpg • No ratios • Sample metadata: locations, unique sample identifiers, object classifications • Analytical metadata: method, lab, data quality & reproducibility (reference material measurements)
  33. Data Templates. LPSC 2015 Workshop: Restoration and Synthesis of Planetary Geochemical Data
  34. EarthChem Data Templates
  35. (no text on slide)
  36. Data Standards: Why? • Re-usability of data • Reproducibility of science • Integration/interoperability of data
  37. Open Geospatial Consortium (OGC): Observations & Measurements • Sampling • Observation • “Observations commonly involve sampling of an ultimate feature of interest. This International Standard defines a common set of sampling feature types classified primarily by topological dimension, as well as samples for ex-situ observations.” (OGC O&M 2.0.0 / ISO 19156; editor: Simon Cox) • e.g. Station, Transect, Section, Specimen
  38. Observation Data Model v2 • ODM2 Team: J S Horsburgh, A K Aufdenkampe, L Hsu, A Jones, K Lehnert, E Mayorga, L Song, D Tarboton, I Zaslavsky • Horsburgh et al., Environmental Modelling & Software, Volume 79, 2016.
  39. PetDB
  40. PetDB Data Mining: Search & Filter • Filter by method or concentration
  41. (no text on slide)
  42. EarthChem Collaborations • External EC Portal contributors: GEOROC, USGS, MetPetDB, GANSEKI • Critical Zone Observatories • DiamondDB (funded by Sloan Foundation/DCO) • DECADE Portal (funded by Sloan Foundation/DCO): collaboration with Global Volcanism Program & MAGA database (C. Cardellini) • Layered Intrusions Database: J. van Tongeren (student engagement project) • MoonDB (funded by NASA 2015-2017): Johnson Space Center, C. Neal
  43. IEDA Data Rescue Initiative • Data Rescue Mini-awards ($7,000) • J. Delano (SUNY Albany), A. Saal, E. Hauri: Apollo samples • J. Gill (UCSC, retired): • P. Janney (UCT): UCT Mantle Xenolith Collection • M. Rhodes (U Mass): Hawaiian Drilling project • T. Fischer (UNM): Russian Volcanic Gas Data • International Data Rescue Award in the Geosciences • Sponsored by Elsevier Research Data division • Awarded 2013 (at AGU FM) and 2015 (at EGU GA) • Competition for 2016 starting soon • Special Issue of GeoResJ on Data Rescue (volume 6, 2015)
  44. EarthChem Portal
  45. Data Analysis
  46. (no text on slide)
  47. Data Analysis
  48. Interoperability with LEPR (M. Ghiorso)
  49. Results at LEPR
  50. Data Analysis
  51. (no text on slide)
  52. EarthCube • Advances coordination, collaboration, and integration • Community governance • Integrative Activities • Fosters new data communities • Research Coordination Networks • Develops and adapts new technologies to structure, transform, integrate, document, harmonize data & metadata • Building Blocks
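
The Editors Roundtable recommendations (slide 32) call for data in tabular form with sample metadata (location, unique sample identifier, object classification) and analytical metadata (method, lab, reference material measurements). The following is a minimal sketch of how a submitted table could be checked against such a list. The column names, the required-field lists, and the example values are assumptions made for illustration; they are not an EarthChem template or schema.

```python
import csv
import io

# Hypothetical column names; the slides do not prescribe a schema.
REQUIRED_SAMPLE_FIELDS = ["sample_id", "latitude", "longitude", "material"]
REQUIRED_ANALYSIS_FIELDS = ["method", "lab", "reference_material"]


def missing_metadata(row):
    """Return the required fields that are absent or empty in one table row."""
    required = REQUIRED_SAMPLE_FIELDS + REQUIRED_ANALYSIS_FIELDS
    return [name for name in required if not (row.get(name) or "").strip()]


def check_table(csv_text):
    """Map each data row number (1-based, header excluded) to its missing fields."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = {}
    for number, row in enumerate(reader, start=1):
        missing = missing_metadata(row)
        if missing:
            problems[number] = missing
    return problems


# Invented example values, for illustration only.
EXAMPLE = """sample_id,latitude,longitude,material,method,lab,reference_material,SiO2_wt_pct
EX-0001,-12.5,45.1,basalt,XRF,Example Lab,BHVO-2,49.8
EX-0002,,45.3,basalt,XRF,,BHVO-2,50.2
"""

if __name__ == "__main__":
    print(check_table(EXAMPLE))  # {2: ['latitude', 'lab']}
```

A check like this only confirms that the metadata columns are filled in; it says nothing about whether the values themselves are correct, which remains a curation task.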
