SlideShare a Scribd company logo
1 of 29
Download to read offline
Why quality control and quality
assurance is important for the legacy
of GEOTRACES through its database?
Adam Leadbetter (alead@bodc.ac.uk), British Oceanographic Data Centre
Outline
- Data matter!
- Why compatible data?
- The Geotraces database
- A data intensive future…
1. Data Matter!
Data matter!
“A scholar’s positive
contribution is measured by
the sum of the original data
that he contributes.
Hypotheses come and go but
data remain.”
Santiago Ramón y Cajal
(Nobel Prize winner,1906) in
Advice to a Young Investigator (1897)
Data matter!
“You are not finished until you have
done the research, published the
results, and published the data,
receiving formal credit for everything.
Preserve or Perish”
Mark Parsons
US National Snow and Ice Data Center
Data Management for the International Polar Year (2006)
2. Why compatible data?
Why compatible data?
“If HTML and the [World Wide] Web
made all the online documents look like
one huge book, [compatibility] will
make all the data in the world look like
one huge database.”
Sir Tim Berners-Lee
W3C
Weaving the Web (1999)
Why compatible data?
Why compatible data?
- The Linked Data cloud is built on compatible data
- Similarly, Geotraces db builds on compatible data
- How?
Why compatible data?
- Intercalibration for QC / QA
- Only on the legacy database
- A distinction must be made where IC has not
happened
- May be older “compliant data” which does not meet
standards
Why compatible data?
- Standards
- Metadata
- Bottle – type & make
- Filter – type & make
- Analytical method
- Parameter codes
- Allows data merging & long-term data archiving
Why compatible data?
- Merging
- Allows easy management of “crossover stations”
- Marked as “fixed stations” in the db
- Enables comparison of data between cruises
Why compatible data?
- Mantra
“To make the data accessible and
usable in 5, 10, 30… years time
without the need to contact the
data originator.”
3. The GeoTraces database
http://www.bodc.ac.uk/geotraces/
The GeoTraces database
- Key parameters
Trace elements
Stable isotopes
Radioactive isotopes
Radiogenic isotopes
Others to allow future work to be done
- Supporting parameters
Salinity, Temperature, O2, nutrients
http://www.bodc.ac.uk/geotraces/
The GeoTraces database
- 2014: Intermediate data product
- It will only include
- Submitted data (get your data in by 2013)
- Intercalibrated data
- Data passed by the IC committee
The GeoTraces database
- 2014: Intermediate data product
- It will only include
- Submitted data (get your data in by 2013)
- Intercalibrated data
- Data passed by the IC committee
4. A data intensive future
A data intensive future

“We know more than we can tell.”
Michael Polanyi
Fellow of the Royal Society
The Tacit Dimension (1967)
A data intensive future
Producers

Consumers
Experience

Data

Creation
Gathering

Information

Presentation
Organization

Knowledge

Integration
Conversation

Context
A data intensive future
Induction
Theory

Tentative hyp.

Pattern

Observation
A data intensive future
Induction
Theory

Tentative hyp.

Pattern

Observation

Deduction
Theory

Hypothesis

Observation
Confirmation
A data intensive future
Abduction
Is a method of logical inference introduced by
C. S. Peirce which comes prior to induction and
deduction for which the colloquial name is to
have a "hunch”
A data intensive future
Abduction
Is a method of logical inference introduced by
C. S. Peirce which comes prior to induction and
deduction for which the colloquial name is to
have a "hunch”
• Starts when an inquirer considers of a set of
seemingly unrelated facts
• armed with an intuition that they are somehow
connected and …
• But data intensive!!
• And this can be a job for visualization!!!
Conclusions
- Data matter – and increasingly so!
- The GeoTraces data assembly centre aids in making
data compatible
- The GeoTraces database will be a big legacy
- Who knows how it may end up being used?
Conclusions
- Low quality data have higher costs
- High quality data require communication
- Need a planned QA & QC strategy
- Investment in training
- Best practices
- Use appropriate tooling
- Extensive metadata to prevent “data entropy”

Robinson, Meyer & Lenhardt (2012). Eos 93(19), 189
Thank you

alead@bodc.ac.uk, @AdamLeadbetter

More Related Content

What's hot

DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE
 
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...2013 DataCite Summer Meeting - Elsevier's program to support research data (H...
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...datacite
 
Data Discoverability and Persistent Identifiers - EUDAT Summer School (Chris...
Data Discoverability and Persistent Identifiers - EUDAT Summer School  (Chris...Data Discoverability and Persistent Identifiers - EUDAT Summer School  (Chris...
Data Discoverability and Persistent Identifiers - EUDAT Summer School (Chris...EUDAT
 
EDI Training Module 12: Learn to Cite and Link Your Data
EDI Training Module 12:  Learn to Cite and Link Your DataEDI Training Module 12:  Learn to Cite and Link Your Data
EDI Training Module 12: Learn to Cite and Link Your DataEnvironmental Data Initiative
 
EDI Training Module 4: Organizing Data Into Publishable Units
EDI Training Module 4: Organizing Data Into Publishable UnitsEDI Training Module 4: Organizing Data Into Publishable Units
EDI Training Module 4: Organizing Data Into Publishable UnitsEnvironmental Data Initiative
 
Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...Greg Landrum
 
EDI Training Module 10: EDI Data Repository Overview
EDI Training Module 10:  EDI Data Repository OverviewEDI Training Module 10:  EDI Data Repository Overview
EDI Training Module 10: EDI Data Repository OverviewEnvironmental Data Initiative
 
20160301 23 Research Data Things
20160301 23 Research Data Things20160301 23 Research Data Things
20160301 23 Research Data ThingsKatina Toufexis
 
Smith RDAP11 NSF Data Management Plan Case Studies
Smith RDAP11 NSF Data Management Plan Case StudiesSmith RDAP11 NSF Data Management Plan Case Studies
Smith RDAP11 NSF Data Management Plan Case StudiesASIS&T
 
Global registries initiative frumkin omodei
Global registries initiative frumkin omodeiGlobal registries initiative frumkin omodei
Global registries initiative frumkin omodeiASIS&T
 
Scott Edmunds slides from #IDCC13 Data Science session
Scott Edmunds slides from #IDCC13 Data Science sessionScott Edmunds slides from #IDCC13 Data Science session
Scott Edmunds slides from #IDCC13 Data Science sessionGigaScience, BGI Hong Kong
 
From Vaccine Management to ICU Planning: How CRISP Unlocked the Power of Data...
From Vaccine Management to ICU Planning: How CRISP Unlocked the Power of Data...From Vaccine Management to ICU Planning: How CRISP Unlocked the Power of Data...
From Vaccine Management to ICU Planning: How CRISP Unlocked the Power of Data...Databricks
 
Altman RDAP11 Policy-based Data Management
Altman RDAP11 Policy-based Data ManagementAltman RDAP11 Policy-based Data Management
Altman RDAP11 Policy-based Data ManagementASIS&T
 
Data Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim ClarkData Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim Clarkdatascienceiqss
 
EDI Training Module 12: An Introduction to Metadata and Data Repositories
EDI Training Module 12:  An Introduction to Metadata and Data RepositoriesEDI Training Module 12:  An Introduction to Metadata and Data Repositories
EDI Training Module 12: An Introduction to Metadata and Data RepositoriesEnvironmental Data Initiative
 
How To Become a Data Scientist in Iran Marketplace
How To Become a Data Scientist in Iran Marketplace How To Become a Data Scientist in Iran Marketplace
How To Become a Data Scientist in Iran Marketplace Mohamadreza Mohtat
 
Standardising research data policies, research data network
Standardising research data policies, research data networkStandardising research data policies, research data network
Standardising research data policies, research data networkJisc RDM
 
Is one enough? Data warehousing for biomedical research
Is one enough? Data warehousing for biomedical researchIs one enough? Data warehousing for biomedical research
Is one enough? Data warehousing for biomedical researchGreg Landrum
 
Managing your data paget
Managing your data pagetManaging your data paget
Managing your data pagetTERN Australia
 
PA webinar on benefits & costs of FAIR implementation in life sciences
PA webinar on benefits & costs of FAIR implementation in life sciences PA webinar on benefits & costs of FAIR implementation in life sciences
PA webinar on benefits & costs of FAIR implementation in life sciences Pistoia Alliance
 

What's hot (20)

DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?DataONE Education Module 01: Why Data Management?
DataONE Education Module 01: Why Data Management?
 
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...2013 DataCite Summer Meeting - Elsevier's program to support research data (H...
2013 DataCite Summer Meeting - Elsevier's program to support research data (H...
 
Data Discoverability and Persistent Identifiers - EUDAT Summer School (Chris...
Data Discoverability and Persistent Identifiers - EUDAT Summer School  (Chris...Data Discoverability and Persistent Identifiers - EUDAT Summer School  (Chris...
Data Discoverability and Persistent Identifiers - EUDAT Summer School (Chris...
 
EDI Training Module 12: Learn to Cite and Link Your Data
EDI Training Module 12:  Learn to Cite and Link Your DataEDI Training Module 12:  Learn to Cite and Link Your Data
EDI Training Module 12: Learn to Cite and Link Your Data
 
EDI Training Module 4: Organizing Data Into Publishable Units
EDI Training Module 4: Organizing Data Into Publishable UnitsEDI Training Module 4: Organizing Data Into Publishable Units
EDI Training Module 4: Organizing Data Into Publishable Units
 
Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...Is that a scientific report or just some cool pictures from the lab? Reproduc...
Is that a scientific report or just some cool pictures from the lab? Reproduc...
 
EDI Training Module 10: EDI Data Repository Overview
EDI Training Module 10:  EDI Data Repository OverviewEDI Training Module 10:  EDI Data Repository Overview
EDI Training Module 10: EDI Data Repository Overview
 
20160301 23 Research Data Things
20160301 23 Research Data Things20160301 23 Research Data Things
20160301 23 Research Data Things
 
Smith RDAP11 NSF Data Management Plan Case Studies
Smith RDAP11 NSF Data Management Plan Case StudiesSmith RDAP11 NSF Data Management Plan Case Studies
Smith RDAP11 NSF Data Management Plan Case Studies
 
Global registries initiative frumkin omodei
Global registries initiative frumkin omodeiGlobal registries initiative frumkin omodei
Global registries initiative frumkin omodei
 
Scott Edmunds slides from #IDCC13 Data Science session
Scott Edmunds slides from #IDCC13 Data Science sessionScott Edmunds slides from #IDCC13 Data Science session
Scott Edmunds slides from #IDCC13 Data Science session
 
From Vaccine Management to ICU Planning: How CRISP Unlocked the Power of Data...
From Vaccine Management to ICU Planning: How CRISP Unlocked the Power of Data...From Vaccine Management to ICU Planning: How CRISP Unlocked the Power of Data...
From Vaccine Management to ICU Planning: How CRISP Unlocked the Power of Data...
 
Altman RDAP11 Policy-based Data Management
Altman RDAP11 Policy-based Data ManagementAltman RDAP11 Policy-based Data Management
Altman RDAP11 Policy-based Data Management
 
Data Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim ClarkData Citation Implementation Guidelines By Tim Clark
Data Citation Implementation Guidelines By Tim Clark
 
EDI Training Module 12: An Introduction to Metadata and Data Repositories
EDI Training Module 12:  An Introduction to Metadata and Data RepositoriesEDI Training Module 12:  An Introduction to Metadata and Data Repositories
EDI Training Module 12: An Introduction to Metadata and Data Repositories
 
How To Become a Data Scientist in Iran Marketplace
How To Become a Data Scientist in Iran Marketplace How To Become a Data Scientist in Iran Marketplace
How To Become a Data Scientist in Iran Marketplace
 
Standardising research data policies, research data network
Standardising research data policies, research data networkStandardising research data policies, research data network
Standardising research data policies, research data network
 
Is one enough? Data warehousing for biomedical research
Is one enough? Data warehousing for biomedical researchIs one enough? Data warehousing for biomedical research
Is one enough? Data warehousing for biomedical research
 
Managing your data paget
Managing your data pagetManaging your data paget
Managing your data paget
 
PA webinar on benefits & costs of FAIR implementation in life sciences
PA webinar on benefits & costs of FAIR implementation in life sciences PA webinar on benefits & costs of FAIR implementation in life sciences
PA webinar on benefits & costs of FAIR implementation in life sciences
 

Similar to Why quality control and quality assurance is important for the legacy of GEOTRACES through its database

Big Data Past, Present and Future – Where are we Headed? - StampedeCon 2014
Big Data Past, Present and Future – Where are we Headed? - StampedeCon 2014Big Data Past, Present and Future – Where are we Headed? - StampedeCon 2014
Big Data Past, Present and Future – Where are we Headed? - StampedeCon 2014StampedeCon
 
DataCite: the Perfect Complement to CrossRef
DataCite: the Perfect Complement to CrossRefDataCite: the Perfect Complement to CrossRef
DataCite: the Perfect Complement to CrossRefCrossref
 
GBIF and reuse of research data, Bergen (2016-12-14)
GBIF and reuse of research data, Bergen (2016-12-14)GBIF and reuse of research data, Bergen (2016-12-14)
GBIF and reuse of research data, Bergen (2016-12-14)Dag Endresen
 
INSPIRE Hackathon Webinar Intro to Linked Data and Semantics
INSPIRE Hackathon Webinar   Intro to Linked Data and SemanticsINSPIRE Hackathon Webinar   Intro to Linked Data and Semantics
INSPIRE Hackathon Webinar Intro to Linked Data and Semanticsplan4all
 
INF2190_W1_2016_public
INF2190_W1_2016_publicINF2190_W1_2016_public
INF2190_W1_2016_publicAttila Barta
 
Visualising the Australian open data and research data landscape
Visualising the Australian open data and research data landscapeVisualising the Australian open data and research data landscape
Visualising the Australian open data and research data landscapeJonathan Yu
 
Australian Open government and research data pilot survey 2017
Australian Open government and research data pilot survey 2017Australian Open government and research data pilot survey 2017
Australian Open government and research data pilot survey 2017Jonathan Yu
 
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureWhy Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureXiaogang (Marshall) Ma
 
In search of lost knowledge: joining the dots with Linked Data
In search of lost knowledge: joining the dots with Linked DataIn search of lost knowledge: joining the dots with Linked Data
In search of lost knowledge: joining the dots with Linked Datajonblower
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactDr. Sunil Kr. Pandey
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so farElena Simperl
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesKrishna Sankar
 
2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...
2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...
2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...datacite
 
What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? Robert Grossman
 
Information Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudInformation Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudDhaval Thakker
 

Similar to Why quality control and quality assurance is important for the legacy of GEOTRACES through its database (20)

Big Data Past, Present and Future – Where are we Headed? - StampedeCon 2014
Big Data Past, Present and Future – Where are we Headed? - StampedeCon 2014Big Data Past, Present and Future – Where are we Headed? - StampedeCon 2014
Big Data Past, Present and Future – Where are we Headed? - StampedeCon 2014
 
BrightTALK - Semantic AI
BrightTALK - Semantic AI BrightTALK - Semantic AI
BrightTALK - Semantic AI
 
DataCite: the Perfect Complement to CrossRef
DataCite: the Perfect Complement to CrossRefDataCite: the Perfect Complement to CrossRef
DataCite: the Perfect Complement to CrossRef
 
Big Data
Big Data Big Data
Big Data
 
Presentation_Final
Presentation_FinalPresentation_Final
Presentation_Final
 
GBIF and reuse of research data, Bergen (2016-12-14)
GBIF and reuse of research data, Bergen (2016-12-14)GBIF and reuse of research data, Bergen (2016-12-14)
GBIF and reuse of research data, Bergen (2016-12-14)
 
INSPIRE Hackathon Webinar Intro to Linked Data and Semantics
INSPIRE Hackathon Webinar   Intro to Linked Data and SemanticsINSPIRE Hackathon Webinar   Intro to Linked Data and Semantics
INSPIRE Hackathon Webinar Intro to Linked Data and Semantics
 
INF2190_W1_2016_public
INF2190_W1_2016_publicINF2190_W1_2016_public
INF2190_W1_2016_public
 
Visualising the Australian open data and research data landscape
Visualising the Australian open data and research data landscapeVisualising the Australian open data and research data landscape
Visualising the Australian open data and research data landscape
 
Australian Open government and research data pilot survey 2017
Australian Open government and research data pilot survey 2017Australian Open government and research data pilot survey 2017
Australian Open government and research data pilot survey 2017
 
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award LectureWhy Data Science Matters - 2014 WDS Data Stewardship Award Lecture
Why Data Science Matters - 2014 WDS Data Stewardship Award Lecture
 
In search of lost knowledge: joining the dots with Linked Data
In search of lost knowledge: joining the dots with Linked DataIn search of lost knowledge: joining the dots with Linked Data
In search of lost knowledge: joining the dots with Linked Data
 
2014 aus-agta
2014 aus-agta2014 aus-agta
2014 aus-agta
 
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & ImpactData Science - An emerging Stream of Science with its Spreading Reach & Impact
Data Science - An emerging Stream of Science with its Spreading Reach & Impact
 
The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so far
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
 
DBMS
DBMSDBMS
DBMS
 
2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...
2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...
2013 DataCite Summer Meeting - DOIs and Supercomputing (Terry Jones - Oak Rid...
 
What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care? What is a Data Commons and Why Should You Care?
What is a Data Commons and Why Should You Care?
 
Information Extraction and Linked Data Cloud
Information Extraction and Linked Data CloudInformation Extraction and Linked Data Cloud
Information Extraction and Linked Data Cloud
 

More from Adam Leadbetter

United by a Common Language
United by a Common LanguageUnited by a Common Language
United by a Common LanguageAdam Leadbetter
 
The Place of Schema.org in Linked Ocean Data
The Place of Schema.org in Linked Ocean DataThe Place of Schema.org in Linked Ocean Data
The Place of Schema.org in Linked Ocean DataAdam Leadbetter
 
Using Erddap as a building block in Ireland's Integrated Digital Ocean
Using Erddap as a building block in Ireland's Integrated Digital OceanUsing Erddap as a building block in Ireland's Integrated Digital Ocean
Using Erddap as a building block in Ireland's Integrated Digital OceanAdam Leadbetter
 
Where Linked Data meets Big Data: Applying standard data models to environmen...
Where Linked Data meets Big Data: Applying standard data models to environmen...Where Linked Data meets Big Data: Applying standard data models to environmen...
Where Linked Data meets Big Data: Applying standard data models to environmen...Adam Leadbetter
 
Managing data for marine sciences
Managing data for marine sciencesManaging data for marine sciences
Managing data for marine sciencesAdam Leadbetter
 
Linked Ocean Data - Exploring connections between marine datasets in a Big Da...
Linked Ocean Data - Exploring connections between marine datasets in a Big Da...Linked Ocean Data - Exploring connections between marine datasets in a Big Da...
Linked Ocean Data - Exploring connections between marine datasets in a Big Da...Adam Leadbetter
 
Linking Open Data in Ireland's Digital Ocean
Linking Open Data in Ireland's Digital OceanLinking Open Data in Ireland's Digital Ocean
Linking Open Data in Ireland's Digital OceanAdam Leadbetter
 
Ocean Data Interoperability Platform - Vocabularies: DOIs for NVS Controlled ...
Ocean Data Interoperability Platform - Vocabularies: DOIs for NVS Controlled ...Ocean Data Interoperability Platform - Vocabularies: DOIs for NVS Controlled ...
Ocean Data Interoperability Platform - Vocabularies: DOIs for NVS Controlled ...Adam Leadbetter
 
Ocean Data Interoperability Platform - Big Data: Velocity
Ocean Data Interoperability Platform - Big Data: VelocityOcean Data Interoperability Platform - Big Data: Velocity
Ocean Data Interoperability Platform - Big Data: VelocityAdam Leadbetter
 
Research Vessel Data Management
Research Vessel Data ManagementResearch Vessel Data Management
Research Vessel Data ManagementAdam Leadbetter
 
Practical solutions to implementing "Born Connected" data systems
Practical solutions to implementing "Born Connected" data systemsPractical solutions to implementing "Born Connected" data systems
Practical solutions to implementing "Born Connected" data systemsAdam Leadbetter
 
Let's talk about data: Citation and publication
Let's talk about data: Citation and publicationLet's talk about data: Citation and publication
Let's talk about data: Citation and publicationAdam Leadbetter
 
Irish Integrated Digital Ocean
Irish Integrated Digital OceanIrish Integrated Digital Ocean
Irish Integrated Digital OceanAdam Leadbetter
 
Ocean Data Interoperability Platform - Big Data - Streams & Workflows
Ocean Data Interoperability Platform - Big Data - Streams & WorkflowsOcean Data Interoperability Platform - Big Data - Streams & Workflows
Ocean Data Interoperability Platform - Big Data - Streams & WorkflowsAdam Leadbetter
 
Ocean Data Interoperability Platform - Big Data
Ocean Data Interoperability Platform - Big DataOcean Data Interoperability Platform - Big Data
Ocean Data Interoperability Platform - Big DataAdam Leadbetter
 
Where did my layer come from? The semantics of data release
Where did my layer come from? The semantics of data releaseWhere did my layer come from? The semantics of data release
Where did my layer come from? The semantics of data releaseAdam Leadbetter
 
Vocabulary Services in EMODNet and SeaDataNet
Vocabulary Services in EMODNet and SeaDataNetVocabulary Services in EMODNet and SeaDataNet
Vocabulary Services in EMODNet and SeaDataNetAdam Leadbetter
 

More from Adam Leadbetter (20)

United by a Common Language
United by a Common LanguageUnited by a Common Language
United by a Common Language
 
The Place of Schema.org in Linked Ocean Data
The Place of Schema.org in Linked Ocean DataThe Place of Schema.org in Linked Ocean Data
The Place of Schema.org in Linked Ocean Data
 
Using Erddap as a building block in Ireland's Integrated Digital Ocean
Using Erddap as a building block in Ireland's Integrated Digital OceanUsing Erddap as a building block in Ireland's Integrated Digital Ocean
Using Erddap as a building block in Ireland's Integrated Digital Ocean
 
Where Linked Data meets Big Data: Applying standard data models to environmen...
Where Linked Data meets Big Data: Applying standard data models to environmen...Where Linked Data meets Big Data: Applying standard data models to environmen...
Where Linked Data meets Big Data: Applying standard data models to environmen...
 
Managing data for marine sciences
Managing data for marine sciencesManaging data for marine sciences
Managing data for marine sciences
 
Linked Ocean Data - Exploring connections between marine datasets in a Big Da...
Linked Ocean Data - Exploring connections between marine datasets in a Big Da...Linked Ocean Data - Exploring connections between marine datasets in a Big Da...
Linked Ocean Data - Exploring connections between marine datasets in a Big Da...
 
Connected Ocean Data
Connected Ocean DataConnected Ocean Data
Connected Ocean Data
 
Linking Open Data in Ireland's Digital Ocean
Linking Open Data in Ireland's Digital OceanLinking Open Data in Ireland's Digital Ocean
Linking Open Data in Ireland's Digital Ocean
 
Ocean Data Interoperability Platform - Vocabularies: DOIs for NVS Controlled ...
Ocean Data Interoperability Platform - Vocabularies: DOIs for NVS Controlled ...Ocean Data Interoperability Platform - Vocabularies: DOIs for NVS Controlled ...
Ocean Data Interoperability Platform - Vocabularies: DOIs for NVS Controlled ...
 
Ocean Data Interoperability Platform - Big Data: Velocity
Ocean Data Interoperability Platform - Big Data: VelocityOcean Data Interoperability Platform - Big Data: Velocity
Ocean Data Interoperability Platform - Big Data: Velocity
 
Research Vessel Data Management
Research Vessel Data ManagementResearch Vessel Data Management
Research Vessel Data Management
 
Practical solutions to implementing "Born Connected" data systems
Practical solutions to implementing "Born Connected" data systemsPractical solutions to implementing "Born Connected" data systems
Practical solutions to implementing "Born Connected" data systems
 
Linked Ocean Data
Linked Ocean DataLinked Ocean Data
Linked Ocean Data
 
Let's talk about data: Citation and publication
Let's talk about data: Citation and publicationLet's talk about data: Citation and publication
Let's talk about data: Citation and publication
 
Linked Ocean Data
Linked Ocean DataLinked Ocean Data
Linked Ocean Data
 
Irish Integrated Digital Ocean
Irish Integrated Digital OceanIrish Integrated Digital Ocean
Irish Integrated Digital Ocean
 
Ocean Data Interoperability Platform - Big Data - Streams & Workflows
Ocean Data Interoperability Platform - Big Data - Streams & WorkflowsOcean Data Interoperability Platform - Big Data - Streams & Workflows
Ocean Data Interoperability Platform - Big Data - Streams & Workflows
 
Ocean Data Interoperability Platform - Big Data
Ocean Data Interoperability Platform - Big DataOcean Data Interoperability Platform - Big Data
Ocean Data Interoperability Platform - Big Data
 
Where did my layer come from? The semantics of data release
Where did my layer come from? The semantics of data releaseWhere did my layer come from? The semantics of data release
Where did my layer come from? The semantics of data release
 
Vocabulary Services in EMODNet and SeaDataNet
Vocabulary Services in EMODNet and SeaDataNetVocabulary Services in EMODNet and SeaDataNet
Vocabulary Services in EMODNet and SeaDataNet
 

Recently uploaded

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 

Recently uploaded (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 

Why quality control and quality assurance is important for the legacy of GEOTRACES through its database

  • 1. Why quality control and quality assurance is important for the legacy of GEOTRACES through its database? Adam Leadbetter (alead@bodc.ac.uk), British Oceanographic Data Centre
  • 2. Outline - Data matter! - Why compatible data? - The Geotraces database - A data intensive future…
  • 4. Data matter! “A scholar’s positive contribution is measured by the sum of the original data that he contributes. Hypotheses come and go but data remain.” Santiago Ramón y Cajal (Nobel Prize winner,1906) in Advice to a Young Investigator (1897)
  • 5. Data matter! “You are not finished until you have done the research, published the results, and published the data, receiving formal credit for everything. Preserve or Perish” Mark Parsons US National Snow and Ice Data Center Data Management for the International Polar Year (2006)
  • 7. Why compatible data? “If HTML and the [World Wide] Web made all the online documents look like one huge book, [compatibility] will make all the data in the world look like one huge database.” Sir Tim Berners-Lee W3C Weaving the Web (1999)
  • 9. Why compatible data? - The Linked Data cloud is built on compatible data - Similarly, Geotraces db builds on compatible data - How?
  • 10. Why compatible data? - Intercalibration for QC / QA - Only on the legacy database - A distinction must be made where IC has not happened - May be older “compliant data” which does not meet standards
  • 11. Why compatible data? - Standards - Metadata - Bottle – type & make - Filter – type & make - Analytical method - Parameter codes - Allows data merging & long-term data archiving
  • 12. Why compatible data? - Merging - Allows easy management of “crossover stations” - Marked as “fixed stations” in the db - Enables comparison of data between cruises
  • 13. Why compatible data? - Mantra “To make the data accessible and usable in 5, 10, 30… years time without the need to contact the data originator.”
  • 14. 3. The GeoTraces database
  • 15.
  • 17. The GeoTraces database - Key parameters Trace elements Stable isotopes Radioactive isotopes Radiogenic isotopes Others to allow future work to be done - Supporting parameters Salinity, Temperature, O2, nutrients http://www.bodc.ac.uk/geotraces/
  • 18. The GeoTraces database - 2014: Intermediate data product - It will only include - Submitted data (get your data in by 2013) - Intercalibrated data - Data passed by the IC committee
  • 19. The GeoTraces database - 2014: Intermediate data product - It will only include - Submitted data (get your data in by 2013) - Intercalibrated data - Data passed by the IC committee
  • 20. 4. A data intensive future
  • 21. A data intensive future “We know more than we can tell.” Michael Polanyi Fellow of the Royal Society The Tacit Dimension (1967)
  • 22. A data intensive future Producers Consumers Experience Data Creation Gathering Information Presentation Organization Knowledge Integration Conversation Context
  • 23. A data intensive future Induction Theory Tentative hyp. Pattern Observation
  • 24. A data intensive future Induction Theory Tentative hyp. Pattern Observation Deduction Theory Hypothesis Observation Confirmation
  • 25. A data intensive future Abduction Is a method of logical inference introduced by C. S. Peirce which comes prior to induction and deduction for which the colloquial name is to have a "hunch”
  • 26. A data intensive future Abduction Is a method of logical inference introduced by C. S. Peirce which comes prior to induction and deduction for which the colloquial name is to have a "hunch” • Starts when an inquirer considers of a set of seemingly unrelated facts • armed with an intuition that they are somehow connected and … • But data intensive!! • And this can be a job for visualization!!!
  • 27. Conclusions - Data matter – and increasingly so! - The GeoTraces data assembly centre aids in making data compatible - The GeoTraces database will be a big legacy - Who knows how it may end up being used?
  • 28. Conclusions - Low quality data have higher costs - High quality data require communication - Need a planned QA & QC strategy - Investment in training - Best practices - Use appropriate tooling - Extensive metadata to prevent “data entropy” Robinson, Meyer & Lenhardt (2012). Eos 93(19), 189