Using e-Infrastructures for
Biodiversity Conservation
Gianpaolo Coro
ISTI-CNR, Pisa, Italy
Module 3 - Outline
• Biodiversity and geospatial data
• Trends in biodiversity observations
• Combining species observations
• Combining biodiversity and geospatial data
D4Science
D4Science is both a Data and a Computational e-Infrastructure
• Used by several projects: i-Marine, EUBrazil OpenBio, ENVRI;
• Implements the notion of e-Infrastructure as-a-Service: it offers on-demand access to data management services and computational facilities;
• Hosts several Virtual Research Environments (VREs) for fisheries managers, biologists, statisticians… and students.
D4Science - Resources
• A large set of connected biodiversity and taxonomic datasets
• A network to distribute and access geospatial data
• A distributed storage system to store datasets and documents
• A social network to share opinions and useful news
• Algorithms for biology-related experiments
Biodiversity and Geospatial Data
Biodiversity Data Providers
i-Marine hosts biodiversity datasets coming from several data providers:
• Some are remotely accessed and are maintained by their respective owners;
• Others are resident in the e-Infrastructure.
Currently, the accessible datasets are:
• Catalogue of Life (CoL)
• Global Biodiversity Information Facility (GBIF)
• Integrated Taxonomic Information System (ITIS)
• Interim Register of Marine and Nonmarine Genera (IRMNG)
• Ocean Biogeographic Information System (OBIS)
• World Register of Marine Species (WoRMS)
• World Register of Deep-Sea Species (WoRDSS)
Some data providers collect data from other data providers, but the alignment between them is not guaranteed!
The datasets allow users to retrieve:
• Occurrence points (presence points or specimens)
• Taxa names
Online Examples:
http://www.catalogueoflife.org/
http://www.gbif.org/
http://www.iobis.org/
Geospatial Data Providers
Providers shown: Bio-ORACLE, World Ocean Atlas
Formats shown: NetCDF, ASCII, ArcGIS, raw formats
Online Examples:
http://www.myocean.eu
https://www.nodc.noaa.gov/OC5/woa13/
http://www.oracle.ugent.be/
ToolsUI ftp://ftp.unidata.ucar.edu/pub/netcdf-java/v4.5/toolsUI-4.5.jar
Trendylyzer
Trendylyzer lets users discover species observation trends. It is based on the OBIS collector.
Example trend: the story of the Coelacanth discovery.
Online Example: the i-Marine Trendylyzer
https://i-marine.d4science.org/group/biodiversitylab/trends-production
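An observation trend can be thought of as the number of records per year for one species. The following is a minimal sketch of that idea only; the record layout (`scientific_name`, `year`), the function name and the sample values are hypothetical and do not reflect the actual Trendylyzer or OBIS data model.

```python
from collections import Counter


def observation_trend(records, species):
    """Count observation records per year for one species."""
    counts = Counter(
        rec["year"]
        for rec in records
        if rec.get("scientific_name") == species and rec.get("year") is not None
    )
    # Sort by year so the result reads as a time series.
    return dict(sorted(counts.items()))


# Hypothetical sample records (illustrative values only).
records = [
    {"scientific_name": "Species A", "year": 1991},
    {"scientific_name": "Species A", "year": 1990},
    {"scientific_name": "Species B", "year": 1990},
    {"scientific_name": "Species A", "year": 1991},
    {"scientific_name": "Species A", "year": None},
]

trend = observation_trend(records, "Species A")
print(trend)
```

Records without a year are skipped rather than guessed, so gaps in the source data stay visible as missing years in the trend.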
Cleaning
Occurrence Points Operations: Union, Difference, Intersection
Each occurrence record (A, B) has the fields:
• x,y
• Event Date
• Modif Date
• Author
• Species Scientific Name
Two records A and B are evaluated as equal (A = B) when
d(x,y) < Distance Thr and LD(Author) * LD(SciName) > Lexical Thr
When two records are equal: <Take the most recent>
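The comparison rule between two occurrence records can be sketched as follows. This is an illustration under stated assumptions, not the D4Science implementation: LD is read here as a normalised Levenshtein similarity in [0, 1] (1 means identical strings), d(x,y) as the Euclidean distance in coordinate units, and the comparisons are made inclusive so that the exact-match setting (DThr=0; LThr=1) can be satisfied. All class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass
class Occurrence:
    x: float
    y: float
    event_date: date
    modif_date: date
    author: str
    sci_name: str


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def lexical_similarity(a: str, b: str) -> float:
    """Assumed reading of LD: 1.0 for identical strings, 0.0 for fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def are_similar(a: Occurrence, b: Occurrence, distance_thr: float, lexical_thr: float) -> bool:
    """Both the spatial and the lexical condition must hold (inclusive bounds assumed)."""
    d = math.hypot(a.x - b.x, a.y - b.y)
    ld = lexical_similarity(a.author, b.author) * lexical_similarity(a.sci_name, b.sci_name)
    return d <= distance_thr and ld >= lexical_thr


def most_recent(a: Occurrence, b: Occurrence) -> Occurrence:
    """Of two similar records, keep the one modified last."""
    return a if a.modif_date >= b.modif_date else b


# Hypothetical records (illustrative values only).
rec_a = Occurrence(12.5, 44.1, date(2001, 5, 1), date(2010, 1, 1), "Smith J.", "Solea solea")
rec_b = Occurrence(12.5, 44.1, date(2001, 5, 1), date(2012, 6, 1), "Smith J.", "Solea solea")
rec_c = Occurrence(12.5, 44.1, date(2001, 5, 1), date(2011, 3, 1), "J. Smith", "Solea solea")

print(are_similar(rec_a, rec_b, 0.0, 1.0))   # identical position and text
print(are_similar(rec_a, rec_c, 0.0, 1.0))   # author written in a different format
print(are_similar(rec_a, rec_c, 0.01, 0.0))  # lexical check effectively disabled
```

Multiplying the two lexical scores means a mismatch in either the author or the scientific name lowers the combined score, which is why differing name formats matter under a strict lexical threshold.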
Experiment: Solea solea
Record counts shown in the figure: 57 085 records, 2 324 records, 1 871 records, 10 542 records
Duplicates Deletion with Exact Match (DThr=0; LThr=1)
Subtraction, with the threshold settings: DThr=0.01, LThr=0; DThr=0.01, LThr=1; DThr=0.0001, LThr=0.8
Subtraction results shown in the figure: 183 records, 0 records, 0 records
Main remarks:
• The "recordedBy" fields contain differences in name formats
• The scientific name fields differ (names vs. names and codes)
• D4Science helps in collecting a larger number of unique Solea solea occurrence records
• Even though GBIF collects data from OBIS, its coverage is not up to date
Occurrence Points Operations
Occurrences Duplicates Deleter:
An algorithm for deleting similar occurrences in a set of occurrence points coming from the Species Discovery Facility of D4Science.
Occurrences Intersection:
Between two occurrence sets A and B, keeps the elements of B that are similar to elements in A.
Occurrences Subtraction:
Between two occurrence sets A and B, keeps the elements of A that are not similar to any element in B.
Occurrences Merger:
Between two occurrence sets A and B, enriches A with the elements of B that are not in A. Updates the elements of A with more recent elements in B. If one element in A corresponds to several more recent elements in B, these replace the element of A.
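The four operations can be sketched as plain list operations parameterised by a similarity test. This is a minimal illustration under assumptions, not the D4Science code: records are hypothetical tuples of (name, x, modification year), and `similar`, `newest` and `is_more_recent` are stand-in helpers for the threshold-based comparison.

```python
def delete_duplicates(points, similar, newest):
    """Keep one record, the most recent, per group of similar records."""
    kept = []
    for p in points:
        for i, q in enumerate(kept):
            if similar(p, q):
                kept[i] = newest(p, q)
                break
        else:
            kept.append(p)
    return kept


def intersection(a, b, similar):
    """Elements of B that are similar to some element of A."""
    return [q for q in b if any(similar(p, q) for p in a)]


def subtraction(a, b, similar):
    """Elements of A that are not similar to any element of B."""
    return [p for p in a if not any(similar(p, q) for q in b)]


def merge(a, b, similar, is_more_recent):
    """A updated with more recent matches from B, plus the elements of B not in A."""
    merged = []
    for p in a:
        newer = [q for q in b if similar(p, q) and is_more_recent(q, p)]
        for item in (newer or [p]):
            if not any(item is m for m in merged):  # avoid adding one record twice
                merged.append(item)
    merged.extend(q for q in b if not any(similar(p, q) for p in a))
    return merged


# Hypothetical helpers and records: (name, x, modification year).
def similar(p, q):
    return p[0] == q[0] and abs(p[1] - q[1]) <= 0.01


def newest(p, q):
    return p if p[2] >= q[2] else q


def is_more_recent(q, p):
    return q[2] > p[2]


set_a = [("Solea solea", 1.0, 2010), ("Solea solea", 5.0, 2011)]
set_b = [("Solea solea", 1.0, 2012), ("Solea solea", 9.0, 2009)]

print(intersection(set_a, set_b, similar))
print(subtraction(set_a, set_b, similar))
print(merge(set_a, set_b, similar, is_more_recent))
print(delete_duplicates(set_a + set_b, similar, newest))
```

Passing the similarity test as a parameter keeps the set logic independent of the chosen thresholds, so the same functions can be run with strict or loose matching.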
Online experiments: the i-Marine Occurrence Management system
https://i-marine.d4science.org/group/biodiversitylab/processing-tools
Combining Biodiversity and Geospatial data
Environmental layers combined with a species occurrence dataset produce an enriched dataset.
Online Experiments:
https://i-marine.d4science.org/group/biodiversitylab/processing-tools
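Enriching an occurrence dataset means attaching to each point the value of every environmental layer at that location. Below is a minimal sketch assuming each layer is a regular grid described by a hypothetical dictionary (`x_min`, `y_min`, `cell_size`, and `values` with rows ordered from south to north); real layers would normally be read from NetCDF or ASCII grid files.

```python
def sample_layer(layer, x, y):
    """Return the grid value of the cell containing (x, y), or None if outside the grid."""
    col = int((x - layer["x_min"]) // layer["cell_size"])
    row = int((y - layer["y_min"]) // layer["cell_size"])
    values = layer["values"]
    if 0 <= row < len(values) and 0 <= col < len(values[row]):
        return values[row][col]
    return None


def enrich(occurrences, layers):
    """Copy each occurrence and add one column per environmental layer."""
    enriched = []
    for occ in occurrences:
        row = dict(occ)
        for name, layer in layers.items():
            row[name] = sample_layer(layer, occ["x"], occ["y"])
        enriched.append(row)
    return enriched


# Hypothetical 2x2 layer and occurrence points (illustrative values only).
layers = {
    "temperature": {
        "x_min": 0.0,
        "y_min": 0.0,
        "cell_size": 1.0,
        "values": [[10.0, 11.0], [12.0, 13.0]],
    }
}
occurrences = [
    {"species": "Species A", "x": 0.5, "y": 0.5},
    {"species": "Species A", "x": 1.5, "y": 1.2},
    {"species": "Species A", "x": 5.0, "y": 5.0},
]

enriched = enrich(occurrences, layers)
for row in enriched:
    print(row)
```

Points that fall outside a layer get `None` instead of a neighbouring value, so missing coverage is explicit in the enriched dataset.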
One practical application
The giant squid - Architeuthis
Timeline shown: 16th century, 2012
The giant squid (Architeuthis) has been reported worldwide since before the 16th century and has recently been observed alive in its habitat for the first time.
Why rare species?
• Biological and evolutionary investigations
• Fisheries management policies and conservation
• Vulnerable Marine Ecosystems
• Key role in affecting biodiversity richness
• Indicators of degradation for aquatic ecosystems
Detecting rare species
• How to build a reliable distribution from few observations?
• How to account for absence locations?
• Is there any approach for rare species?
Data quality
For rare species, data quality is fundamental:
• Reliable presence data
• Reliable absence locations
• High quality environmental features
• Non-noisy environmental features
Tools – i-marine.d4science.org
D4Science e-Infrastructure:
• Retrieve presence data
• Generate absence data
• Get environmental data
• Model, adjust data and produce maps
• Share results
1. Presence data of A. dux from D4S
https://i-marine.d4science.org/group/biodiversitylab/species-data-discovery
2. Simulating A. dux absence locations from AquaMaps
https://i-marine.d4science.org/group/biodiversitylab/processing-tools
AquaMaps Native: 0 < Prob. < 0.2
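The selection of simulated absence locations can be sketched as a filter on an expert-system probability map followed by random sampling. This is an illustration only: the map is assumed to be a hypothetical dictionary from cell centre (x, y) to probability, and the function name, sampling strategy and seed are assumptions rather than the D4Science procedure.

```python
import random


def simulate_absences(probabilities, n, low=0.0, high=0.2, seed=0):
    """Pick up to n cells whose probability lies strictly between low and high."""
    candidates = [cell for cell, p in probabilities.items() if low < p < high]
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    return rng.sample(candidates, min(n, len(candidates)))


# Hypothetical probability map (illustrative values only).
probabilities = {
    (0.25, 0.25): 0.0,
    (0.75, 0.25): 0.05,
    (1.25, 0.25): 0.15,
    (1.75, 0.25): 0.60,
    (2.25, 0.25): 0.19,
}

absences = simulate_absences(probabilities, n=2)
print(absences)
```

Cells with probability exactly 0 are excluded by the strict lower bound, matching the stated range 0 < Prob. < 0.2.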
3. Environmental Features
https://i-marine.d4science.org/group/biodiversitylab/geo-visualisation
https://i-marine.d4science.org/group/biodiversitylab/processing-tools
Most of these layers were available in D4Science. Depth and Distance from land were imported using the Statistical Manager.
4. MaxEnt model as filter
https://i-marine.d4science.org/group/biodiversitylab/processing-tools
Inputs: presence data, env. data
MaxEnt selects the env. features most correlated to the giant squid
Output: filtered environmental features
5. Presence/absence modelling: Artificial Neural Networks (ANN)
https://i-marine.d4science.org/group/biodiversitylab/processing-tools
Inputs: presence/absence data, filtered env. features
Model trained on positive and negative examples, in terms of env. features
Output: binary file
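Training on positive and negative examples can be illustrated with a very small feed-forward network. The sketch below assumes one hidden layer, sigmoid units, squared-error loss and plain stochastic gradient descent; the topology, learning rate, feature names and sample values are all hypothetical and are not the settings used on D4Science.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(model, x):
    """Return hidden activations and the output for one feature vector."""
    w1, w2 = model
    hidden = [sigmoid(sum(w * v for w, v in zip(ws, x)) + ws[-1]) for ws in w1]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + w2[-1])
    return hidden, output


def train_ann(samples, labels, hidden=3, epochs=2000, lr=0.5, seed=0):
    """Train a one-hidden-layer network; the last weight of each unit is its bias."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            h, o = forward((w1, w2), x)
            # Backpropagate the squared error through the sigmoid units.
            delta_o = (o - target) * o * (1.0 - o)
            delta_h = [delta_o * w2[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                w2[j] -= lr * delta_o * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * delta_h[j] * x[i]
                w1[j][-1] -= lr * delta_h[j]
            w2[-1] -= lr * delta_o
    return w1, w2


def predict(model, x):
    """Estimated presence probability for one feature vector."""
    return forward(model, x)[1]


# Hypothetical features scaled to [0, 1], e.g. (depth, distance from land).
presence = [[0.9, 0.8], [0.8, 0.9], [0.85, 0.7]]
absence = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
model = train_ann(presence + absence, [1.0] * 3 + [0.0] * 3)

print(predict(model, [0.9, 0.8]))
print(predict(model, [0.1, 0.2]))
```

The `hidden` parameter is the knob one would vary when comparing learning topologies; the toy data here is cleanly separable, which real presence/absence data usually is not.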
6. Projection of the Neural Network
https://i-marine.d4science.org/group/biodiversitylab/processing-tools
7. Comparison
https://i-marine.d4science.org/group/biodiversitylab/processing-tools
Maps shown: Expert map (Nesis, 2003); MaxEnt (presence-only); AquaMaps Suitable (expert system); Neural Network (presence/absence)
Similarity values shown: 22.01%, 21.68%, 42.83%
Similarity calculated using Maps Comparison, by Coro, Ellenbroek, Pagano, DOI: 10.1080/15481603.2014.959391
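One simple way to express the similarity of two gridded maps as a percentage is the share of common cells whose values agree within a tolerance. The sketch below uses that definition purely as an assumption for illustration; it is not the published Maps Comparison algorithm, and the map layout (a dictionary from cell to probability), the tolerance and the sample values are hypothetical.

```python
def map_similarity(map_a, map_b, tolerance=0.1):
    """Percentage of shared cells whose values differ by at most the tolerance."""
    shared = [cell for cell in map_a if cell in map_b]
    if not shared:
        return 0.0
    agreeing = sum(1 for cell in shared if abs(map_a[cell] - map_b[cell]) <= tolerance)
    return 100.0 * agreeing / len(shared)


# Hypothetical probability maps on the same 2x2 grid (illustrative values only).
expert_map = {(0, 0): 0.85, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.0}
model_map = {(0, 0): 0.8, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.3}

print(map_similarity(expert_map, model_map))
```

Only cells present in both maps are compared, so maps with different coverage are not penalised for the cells they do not share.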
Conclusions
• Enhancing data quality produces a high-performance distribution
• A presence/absence ANN combines these data
• Biological, observation and expert evidence confirm the prediction by the ANN
Summary: modelling rare species distributions
1. Retrieve high quality presence locations by relying on the metadata of the records.
2. Use expert knowledge or an expert system to detect absence locations. Select absence locations that are as widespread as possible.
3. Select a number of environmental characteristics correlated to the species presence.
4. Use MaxEnt to filter the environmental characteristics that are really important with respect to the presence points.
5. Train an Artificial Neural Network on presence and absence locations and select the best learning topology.
6. Project the ANN at global scale, using a resolution equal to the maximum in the environmental features.
7. Train a MaxEnt model as a comparison system.
Just another example
Coelacanth, Smith 1939
Models shown: GARP, MaxEnt, AquaMaps, Neural Network
Coro, Gianpaolo, Pasquale Pagano, and Anton Ellenbroek. "Combining simulated expert knowledge with Neural Networks to produce Ecological Niche Models for Latimeria chalumnae." Ecological Modelling 268 (2013): 55-63.

 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 

USING E-INFRASTRUCTURES FOR BIODIVERSITY CONSERVATION - Module 3

  • 1. Using e-Infrastructures for Biodiversity Conservation Gianpaolo Coro ISTI-CNR, Pisa, Italy
  • 2. • Biodiversity and geospatial data • Trends in biodiversity observations • Combining species observations • Combining biodiversity and geospatial data Module 3 - Outline
  • 3. D4Science D4Science is both a Data and a Computational e-Infrastructure • Used by several Projects: i-Marine, EUBrazil OpenBio, ENVRI; • Implements the notion of e-Infrastructure as-a-Service: it offers on demand access to data management services and computational facilities; • Hosts several VREs for Fisheries Managers, Biologists, Statisticians…and Students.
  • 4. D4Science - Resources • A large set of connected biodiversity and taxonomic datasets • A network to distribute and access geospatial data • A distributed storage system to store datasets and documents • A social network to share opinions and useful news • Algorithms for biology-related experiments
  • 5. • Biodiversity and geospatial data • Trends in biodiversity observations • Combining species observations • Combining biodiversity and geospatial data Module 3 - Outline
  • 7. Biodiversity Data Providers i-Marine hosts biodiversity datasets coming from several data providers: • Some are accessed remotely and are maintained by their respective owners; • Others reside in the e-Infrastructure. Currently, the accessible datasets are: • Catalogue of Life (CoL), • Global Biodiversity Information Facility (GBIF), • Integrated Taxonomic Information System (ITIS), • Interim Register of Marine and Nonmarine Genera (IRMNG), • Ocean Biogeographic Information System (OBIS), • World Register of Marine Species (WoRMS), • World Register of Deep-Sea Species (WoRDSS). Some data providers aggregate data from other providers, but alignment between them is not guaranteed! The datasets allow retrieving: • Occurrence points (presence points or specimens) • Taxa names
  • 9. Geospatial Data Providers Bio-ORACLE NetCDF NetCDF ASCII ArcGIS ASCII Raw formats World Ocean Atlas
  • 11. • Biodiversity and geospatial data • Trends in biodiversity observations • Combining species observations • Combining biodiversity and geospatial data
  • 12. Trendylyzer Trendylyzer discovers trends in species observations. It is based on the OBIS collector. Example: this trend tells the story of the Coelacanth discovery.
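An observation trend of this kind can be approximated by counting records per year. The sketch below is only an illustration of that idea, not Trendylyzer's implementation: the function name, the `(scientific_name, year)` record layout and the sample records are all assumptions made up for the example.

```python
from collections import Counter


def observations_per_year(records, species):
    """Count the records of `species` per year.

    `records` is an iterable of (scientific_name, year) pairs; this layout
    is a hypothetical simplification of an occurrence dataset.
    """
    counts = Counter(year for name, year in records if name == species)
    return dict(sorted(counts.items()))


# Invented sample records, for illustration only.
sample = [
    ("Latimeria chalumnae", 1938),
    ("Latimeria chalumnae", 1952),
    ("Latimeria chalumnae", 1952),
    ("Solea solea", 1952),
]
trend = observations_per_year(sample, "Latimeria chalumnae")
```

Plotting `trend` as a time series would give the kind of curve the slide refers to.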
  • 13. Online Example: the i-Marine Trendylyzer https://i-marine.d4science.org/group/biodiversitylab/trends-production
  • 14. • Biodiversity and geospatial data • Trends in biodiversity observations • Combining species observations • Combining biodiversity and geospatial data
  • 16. Union – Difference - Intersection
  • 17. Occurrences Points Operations Two records, A and B, each carry: x,y; Event Date; Modif Date; Author; Species Scientific Name. Evaluate: if d(x,y) < Distance Thr and LD(Author) * LD(SciName) > Lexical Thr, then take the most recent record.
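The evaluation rule on this slide can be sketched as follows. Several points are assumptions rather than facts from the slides: LD is treated here as a lexical similarity in [0, 1] and is approximated with `difflib.SequenceMatcher`, d(x,y) is taken as a plain Euclidean distance in coordinate units, "most recent" is read as the later modification date, and the comparisons are inclusive so that the exact-match setting (DThr=0; LThr=1) used in the experiment can succeed. All names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from difflib import SequenceMatcher
from math import hypot


@dataclass(frozen=True)
class Occurrence:
    x: float
    y: float
    event_date: date
    modif_date: date
    author: str
    sci_name: str


def lexical_similarity(a: str, b: str) -> float:
    # Stand-in for LD: 1.0 for identical strings, lower for differing ones.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def are_similar(a: Occurrence, b: Occurrence, dist_thr: float, lex_thr: float) -> bool:
    # Spatial test: Euclidean distance between the two coordinate pairs.
    distance = hypot(a.x - b.x, a.y - b.y)
    # Lexical test: product of the author and scientific-name similarities.
    lexical = lexical_similarity(a.author, b.author) * lexical_similarity(
        a.sci_name, b.sci_name
    )
    # Inclusive comparisons are an assumption (see the lead-in).
    return distance <= dist_thr and lexical >= lex_thr


def most_recent(a: Occurrence, b: Occurrence) -> Occurrence:
    # Assumption: recency is judged on the modification date.
    return a if a.modif_date >= b.modif_date else b
```

With DThr=0 and LThr=1 only records with identical coordinates, author and scientific name are matched; lowering LThr tolerates differences in name formats.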
  • 18. Experiment Solea solea 57 085 Records 2 324 Records 1 871 Records 10 542 Records Duplicates Deletion with Exact Match (DThr=0; LThr=1) Subtraction DThr=0.01; LThr=0 DThr=0.01; LThr=1 DThr=0.0001; LThr=0.8 183 Records 0 Records 0 Records Main remarks: • The “recordedBy” fields contain differences in name formats • The Scientific Name fields differ (names vs. names and codes) • D4Science helps to collect a larger number of unique Solea solea occurrence records • Even though GBIF collects data from OBIS, its coverage is not up to date
  • 19. Occurrences Points Operations Occurrences Duplicates Deleter: An algorithm for deleting similar occurrences in a set of occurrence points coming from the Species Discovery Facility of D4Science. A
  • 20. Occurrences Points Operations Occurrences Intersection: Between two Occurrence Sets A and B, keeps the elements of B that are similar to elements in A. A B
  • 21. Occurrences Points Operations Occurrences Subtraction: Between two Occurrence Sets A and B, keeps the elements of A that are not similar to any element in B. A B
  • 22. Occurrences Points Operations Occurrences Merger: Between two Occurrence Sets A and B, enriches A with the elements of B that are not in A. Updates the elements of A with more recent elements in B. If one element in A corresponds to several more recent elements in B, these replace the element of A. A B
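The four operations described on slides 19 to 22 can be sketched as plain functions over lists. This is a minimal illustration under stated assumptions, not the D4Science code: the similarity test is passed in as a `similar(a, b)` predicate, recency comes from a `timestamp(record)` function, and the tuple records in the example are invented.

```python
def delete_duplicates(points, similar, timestamp):
    """Keep one record per group of similar records, preferring the most recent."""
    kept = []
    for rec in points:
        for i, existing in enumerate(kept):
            if similar(existing, rec):
                if timestamp(rec) > timestamp(existing):
                    kept[i] = rec
                break
        else:
            kept.append(rec)
    return kept


def intersection(a, b, similar):
    """Elements of B that are similar to at least one element of A."""
    return [rb for rb in b if any(similar(ra, rb) for ra in a)]


def subtraction(a, b, similar):
    """Elements of A that are not similar to any element of B."""
    return [ra for ra in a if not any(similar(ra, rb) for rb in b)]


def merge(a, b, similar, timestamp):
    """Enrich A with unmatched elements of B; replace elements of A by newer matches in B."""
    result, used = [], set()
    for ra in a:
        newer = [
            i for i, rb in enumerate(b)
            if similar(ra, rb) and timestamp(rb) > timestamp(ra)
        ]
        if newer:
            # All more recent matches replace the element of A (each B record once).
            result.extend(b[i] for i in newer if i not in used)
            used.update(newer)
        else:
            result.append(ra)
    # Elements of B with no counterpart in A are appended.
    result.extend(rb for rb in b if not any(similar(ra, rb) for ra in a))
    return result


# Invented records: (identifier, x coordinate, modification stamp).
def same_point(r, s):
    return r[0] == s[0] and abs(r[1] - s[1]) <= 0.01


def stamp(r):
    return r[2]


set_a = [("p1", 0.0, 1), ("p2", 5.0, 1)]
set_b = [("p1", 0.0, 2), ("p3", 9.0, 1)]
merged = merge(set_a, set_b, same_point, stamp)
```

Passing the predicate as a parameter keeps the set logic independent of how the distance and lexical thresholds are chosen.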
  • 23. Online experiments: the i-Marine Occurrence Management system https://i-marine.d4science.org/group/biodiversitylab/processing-tools
  • 24. • Biodiversity and geospatial data • Trends in biodiversity observations • Combining species observations • Combining biodiversity and geospatial data Module 3 - Outline
  • 25. Combining Biodiversity and Geospatial data Environmental layers Species occurrence dataset Enriched dataset
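Enriching an occurrence dataset amounts to sampling each environmental layer at every occurrence location. The sketch below assumes a regular latitude/longitude grid held in memory and occurrences stored as dictionaries with `lon` and `lat` keys; the `Layer` class, its fields and the sample values are hypothetical and stand in for layers that would normally be read from NetCDF or ASCII grid files.

```python
from dataclasses import dataclass
from math import floor


@dataclass
class Layer:
    name: str
    values: list       # rows ordered north to south, columns west to east
    west: float        # longitude of the grid's western edge
    north: float       # latitude of the grid's northern edge
    resolution: float  # cell size in degrees

    def value_at(self, lon, lat):
        # floor() rather than int() so points outside the western or
        # northern edge map to negative indices and are rejected.
        col = floor((lon - self.west) / self.resolution)
        row = floor((self.north - lat) / self.resolution)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[row]):
            return self.values[row][col]
        return None  # the point falls outside the layer


def enrich(occurrences, layers):
    """Return copies of the occurrences with one extra field per layer."""
    return [
        {**occ, **{layer.name: layer.value_at(occ["lon"], occ["lat"]) for layer in layers}}
        for occ in occurrences
    ]


# Invented 2x2 layer covering longitudes 0..2 and latitudes 0..2.
depth = Layer("depth", [[1.0, 2.0], [3.0, 4.0]], west=0.0, north=2.0, resolution=1.0)
enriched = enrich([{"lon": 1.5, "lat": 0.5}, {"lon": -0.5, "lat": 0.5}], [depth])
```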
  • 28. The giant squid - Architeuthis 16th century 2012 The giant squid (Architeuthis) has been reported worldwide even before the 16th century, and has recently been observed live in its habitat for the first time.
  • 29. Why rare species? • Biological and evolutionary investigations • Fisheries management policies and conservation • Vulnerable Marine Ecosystems • Key role in affecting biodiversity richness • Indicators of degradation for aquatic ecosystems
  • 30. Detecting rare species • How to build a reliable distribution from few observations? • How to account for absence locations? • Is there any approach for rare species?
  • 31. Data quality For rare species, data quality is fundamental: • Reliable presence data • Reliable absence locations • High quality environmental features • Non-noisy environmental features
  • 32. Tools – i-marine.d4science.org D4Science e-Infrastructure: • Retrieve presence data • Generate absence data • Get environmental data • Model, adjust data and produce maps • Share results
  • 33. 1. Presence data of A. dux from D4S https://i-marine.d4science.org/group/biodiversitylab/species-data-discovery
  • 34. 2. Simulating A. dux absence locations from AquaMaps https://i-marine.d4science.org/group/biodiversitylab/processing-tools AquaMaps Native, 0 < Prob. < 0.2
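One way to read this step is: take the cells where the AquaMaps Native probability is above 0 but below 0.2, and draw absence locations from them. The sketch below assumes the probabilities are available as a dictionary keyed by cell coordinates and that locations are drawn uniformly at random; both the data layout and the sampling strategy are assumptions, and the probability values are invented.

```python
import random


def simulate_absences(probabilities, n, low=0.0, high=0.2, seed=None):
    """Draw up to `n` absence locations among cells with low < probability < high.

    `probabilities` maps (lon, lat) cell centres to a probability in [0, 1].
    """
    candidates = sorted(cell for cell, p in probabilities.items() if low < p < high)
    rng = random.Random(seed)
    return rng.sample(candidates, min(n, len(candidates)))


# Invented probabilities, for illustration only.
probs = {(0, 0): 0.0, (1, 0): 0.1, (2, 0): 0.15, (3, 0): 0.5, (4, 0): 0.19}
absences = simulate_absences(probs, 2, seed=1)
```

Cells with probability exactly 0 are excluded by the strict lower bound, following the range shown on the slide.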
  • 35. 3. Environmental Features https://i-marine.d4science.org/group/biodiversitylab/geo-visualisation https://i-marine.d4science.org/group/biodiversitylab/processing-tools Most of these layers were available in D4Science. Depth and Distance from land were imported using the Statistical Manager.
  • 36. 4. MaxEnt model as filter https://i-marine.d4science.org/group/biodiversitylab/processing-tools MaxEnt Env. features most correlated to the giant squid Presence data Env. data
  • 38. 5. Presence/absence modelling: Artificial Neural Networks (ANN) Model trained on positive and negative examples In terms of env. features Binary file https://i-marine.d4science.org/group/biodiversitylab/processing-tools Presence/absence data Filtered env. features
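Training on positive and negative examples can be illustrated with a very small feed-forward network. The sketch below is pure Python and makes many assumptions: one hidden layer of tanh units, a sigmoid output, stochastic gradient descent on the cross-entropy loss, and arbitrary hyperparameters. It is not the network used in D4Science, and the single toy feature is invented.

```python
import math
import random


def _sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _forward(w1, w2, x):
    # Each weight row ends with its bias term.
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w1]
    output = _sigmoid(sum(w * h for w, h in zip(w2, hidden)) + w2[-1])
    return hidden, output


def train_ann(samples, labels, hidden=4, epochs=1000, lr=0.3, seed=0):
    """Train on feature vectors with labels 1 (presence) or 0 (absence)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden + 1)]
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            h, y = _forward(w1, w2, x)
            delta_out = y - target  # cross-entropy gradient at the output
            for j in range(hidden):
                delta_h = delta_out * w2[j] * (1.0 - h[j] ** 2)
                for i in range(n_in):
                    w1[j][i] -= lr * delta_h * x[i]
                w1[j][-1] -= lr * delta_h
                w2[j] -= lr * delta_out * h[j]
            w2[-1] -= lr * delta_out
    return w1, w2


def predict(model, x):
    """Return the estimated presence probability for feature vector `x`."""
    w1, w2 = model
    return _forward(w1, w2, x)[1]


# Invented toy data: one rescaled environmental feature per location.
features = [[0.0], [0.1], [0.2], [0.8], [0.9], [1.0]]
labels = [0, 0, 0, 1, 1, 1]
model = train_ann(features, labels)
```

Projecting the trained model, as in the next step, would mean calling `predict` on the environmental features of every cell of the target grid.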
  • 39. 6. Projection of the Neural Network https://i-marine.d4science.org/group/biodiversitylab/processing-tools
  • 40. 7. Comparison MaxEnt (presence-only) 22.01% 21.68% Similarity calculated using Maps Comparison, by Coro, Ellenbroek, Pagano DOI: 10.1080/15481603.2014.959391 Expert map, Nesis, 2003 Aquamaps Suitable (expert system) Neural Network (presence/absence) 42.83% https://i-marine.d4science.org/group/biodiversitylab/processing-tools
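A similarity percentage between two maps can be illustrated with a simple cell-by-cell agreement score. The sketch below is not the Maps Comparison method cited on the slide: it is a swapped-in, simplified measure that thresholds both maps into presence/absence and reports the share of common cells on which they agree. The function name, the 0.5 threshold and the sample maps are assumptions.

```python
def percent_agreement(map_a, map_b, threshold=0.5):
    """Percentage of shared cells where both maps agree on presence/absence.

    Each map is a dict from cell identifier to a probability in [0, 1].
    """
    shared = map_a.keys() & map_b.keys()
    if not shared:
        raise ValueError("the two maps have no cells in common")
    agreeing = sum(
        (map_a[cell] >= threshold) == (map_b[cell] >= threshold) for cell in shared
    )
    return 100.0 * agreeing / len(shared)


# Invented maps, for illustration only.
model_map = {"c1": 0.9, "c2": 0.1, "c3": 0.7, "c4": 0.2}
expert_map = {"c1": 0.8, "c2": 0.6, "c3": 0.9, "c5": 0.1}
score = percent_agreement(model_map, expert_map)
```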
  • 41. Conclusions • Enhancing data quality produces a high-performance distribution • A presence/absence ANN combines these data • Biological, observation and expert evidence confirm the prediction by the ANN
  • 42. Summary: modelling rare species distributions 1. Retrieve high quality presence locations by relying on the metadata of the records, 2. Use expert knowledge or an expert system to detect absence locations. Select absence locations as widespread as possible, 3. Select a number of environmental characteristics correlated to the species presence, 4. Use MaxEnt to filter the environmental characteristics that are really important with respect to the presence points, 5. Train an Artificial Neural Network on presence and absence locations and select the best learning topology, 6. Project the ANN at global scale, using a resolution equal to the maximum among the environmental features, 7. Train a MaxEnt model as a comparison system.
  • 43. Just another example Coelacanth, Smith 1939 GARP MaxEnt AquaMaps Neural Network Coro, Gianpaolo, Pasquale Pagano, and Anton Ellenbroek. "Combining simulated expert knowledge with Neural Networks to produce Ecological Niche Models for Latimeria chalumnae." Ecological Modelling 268 (2013): 55-63.