Making Data FAIR on WikiData - Andra Waagmeester
Structure
● Introduction
○ Wikidata
○ Wikibase
○ Gene Wiki
● Bits and Bolts
○ How to get your data into Wikidata/Wikibase
○ Using Wikidata/Wikibase
● Is Wikidata FAIR?
○ Where is it located in the research life cycle
○ How FAIR is Wikidata/Wikibase/Genewiki
○ interoperate/collaborate/engage
○ one key change to successfully implement FAIR data infrastructure
Wikidata and its sisters
Wikidata is to data as Wikipedia is to text
Wikidata is a collaboratively edited knowledge base operated by the Wikimedia Foundation
● Completely free, even for commercial usage (CC0)
● Anybody can contribute
● Covers all domains of knowledge
● Extensive item history, talk pages, projects, users
● Integration with the semantic web
● High performance query engine (SPARQL)
● Stable! Long-term support not dictated by funding cycles
● Actively developed
● Already has a large number of active users, editors, and contributors!
A giant graph of knowledge!
FAIR on Wikidata The Wikimedia infrastructure
Infrastructure Resource Content Where?
Data
Text
Schema
Under development
https://www.wikidata.org
https://www.wikibase.org
https://<lang>.wikipedia.org
https://releases.wikimedia.org/mediawiki/
https://shex.io
https://wikidata-shex.wmflabs.org/
FAIR on Wikidata: Providing identifiers
http://ceur-ws.org/Vol-2275/short1.pdf
Tools using Wikidata
Gene Wiki
The Gene Wiki project, circa 2008
Huss, PLoS Biol, 2008
Data imported from structured databases
Summarized knowledge via crowdsourcing
EMA
GWAS Central
NDF-RT
NCBI Gene
InterPro
Guide to Pharmacology
PubChem
Chlambase.org and Wikigenomes.org
FAIRification
Community engagement and model discussion
Formally capture and describe model and community consensus
Model development
● Legacy review – develop punch lists for existing data issues that need fixing
● Documentation – terse, human-readable representation helping contributors and maintainers quickly grok the model
● Client pre-submission – submitters test their data before submission to make sure they’re saying what they want to say and that the receiving schema can accommodate all of their data
● Server pre-ingestion – submission process checks data as it comes in and either rejects or warns about non-conformant data
● Model structure of items (genes, drugs, diseases, etc.) & relationships between items
● Import data from many sources and ontologies
● Linked to many identifiers from external databases
● Architecture for maintaining data from external sources
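The client pre-submission and server pre-ingestion checks listed above can be sketched in Python. This is a deliberately simplified stand-in and not a ShEx implementation: the shape is a plain dictionary of required property IDs, and `GENE_SHAPE`, `check_item`, `pre_ingest`, and the sample submissions are hypothetical names and values introduced only for illustration. The property and item IDs are assumed to match Wikidata and should be verified before use.

```python
# Simplified stand-in for schema validation; NOT a ShEx implementation.
# The shape is hypothetical: it only lists properties an item must carry.
GENE_SHAPE = {
    "P31": "instance of",
    "P351": "Entrez Gene ID",
    "P703": "found in taxon",
}


def check_item(claims, shape):
    """Return human-readable problems; an empty list means the item conforms."""
    problems = []
    for prop, name in shape.items():
        if not claims.get(prop):
            problems.append(f"missing {prop} ({name})")
    return problems


def pre_ingest(items, shape):
    """Split submissions into accepted and rejected, as a server-side check might."""
    accepted, rejected = [], {}
    for label, claims in items.items():
        problems = check_item(claims, shape)
        if problems:
            rejected[label] = problems
        else:
            accepted.append(label)
    return accepted, rejected


# Made-up submissions for illustration only; "0000" is a placeholder identifier.
submissions = {
    "complete gene": {"P31": ["Q7187"], "P351": ["0000"], "P703": ["Q15978631"]},
    "incomplete gene": {"P31": ["Q7187"]},
}
accepted, rejected = pre_ingest(submissions, GENE_SHAPE)
print(accepted)
print(rejected)
```

A submitter could run `check_item` locally before sending data, while the receiving side could run `pre_ingest` and either reject or merely warn about the entries in `rejected`.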
Seeding with data
Platforms for (semi-) automatic data ingestion
● QuickStatements: https://tools.wmflabs.org/quickstatements/
● OpenRefine: https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine
● Wikidata Integrator: https://github.com/SuLab/WikidataIntegrator
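As a rough illustration of semi-automatic ingestion, the sketch below generates a small batch in the tab-separated QuickStatements V1 command format, which can then be pasted into the QuickStatements tool. The function name `quickstatements_for_gene` and the gene symbol and identifier passed to it are placeholders, not real data. The property and item IDs (P31, P703, P351, Q7187, Q15978631) are assumed to be the current Wikidata identifiers and should be checked before running a real batch.

```python
def quickstatements_for_gene(symbol, entrez_id):
    """Build QuickStatements V1 commands (tab-separated) that create one gene item."""
    rows = [
        ["CREATE"],                              # start a new item
        ["LAST", "Len", f'"{symbol}"'],          # English label
        ["LAST", "P31", "Q7187"],                # instance of: gene (assumed ID)
        ["LAST", "P703", "Q15978631"],           # found in taxon: Homo sapiens (assumed ID)
        ["LAST", "P351", f'"{entrez_id}"'],      # Entrez Gene ID, a quoted string value
    ]
    return "\n".join("\t".join(row) for row in rows)


# Placeholder values for illustration; not a real gene record.
batch = quickstatements_for_gene("EXAMPLE1", "0000")
print(batch)
```

For larger or recurring imports, a bot built on the Wikidata Integrator module listed above is likely a better fit than hand-generated batches.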
Wikidataintegrator
Simple data retrieval
39 genes
gene geneLabel gene geneLabel gene geneLabel gene geneLabel
Q5013317 COL22A1 Q18027370 IGSF3 Q18053559 CDHR3 Q14903974 SMAD3
Q14912759 SLC22A5 Q18045382 HPSE2 Q18045669 ATG3 Q18033889 IL1RL1
Q14914243 PSAP Q18048437 IL33 Q18035037 RAD50 Q17917202 ERBB4
Q14907990 SLC30A8 Q18051900 PYHIN1 Q18036984 FBXL7 Q18027836 IL6R
Q18025002 GAB1 Q17709208 ACO1 Q18033919 XPR1 Q18030185 NOTCH4
Q18035589 C6orf10 Q18027822 IL2RB Q15326496 RORA Q18030409 PDE4D
Q18054256 GSDMA Q18030364 PBX2 Q18042132 GSDMB Q18045645 IKZF4
Q18058487 C5orf56 Q18037773 ABI3BP Q18029145 MKLN1 Q18039979 KLHL5
Q18030785 PRKG1 Q18039623 CTNNA3 Q18036729 RAP1GAP2 Q18026947 HLA-DQA1
Q18033424 IL18R1 Q18046350 ZNF665 Q14878303 IL13
“Retrieve genes with
GWAS association
with asthma”
http://bit.ly/bosc2017_wikidata
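A retrieval like the one above could be sketched in Python against the Wikidata Query Service (query.wikidata.org). This is not the query behind the slide: it assumes that disease items carry "genetic association" statements (P2293) pointing to genes and that Q35869 is the asthma item, and it does not filter on the GWAS determination method, so the result count may differ from the 39 genes shown. The names `build_query` and `run_query` and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(disease_qid):
    """SPARQL for genes linked to a disease via 'genetic association' (P2293, assumed)."""
    return f"""
SELECT ?gene ?geneLabel WHERE {{
  wd:{disease_qid} wdt:P2293 ?gene .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
""".strip()


def run_query(query):
    """Send the query to the Wikidata Query Service and return the result bindings."""
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url, headers={"User-Agent": "fair-wikidata-example/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


query = build_query("Q35869")  # Q35869 is assumed to be the asthma item
print(query)

# To execute against the live endpoint (requires network access):
# for row in run_query(query):
#     print(row["gene"]["value"], row["geneLabel"]["value"])
```

Keeping query construction separate from the network call makes it easy to inspect or paste the generated SPARQL into the query service's web interface first.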
Data integration
“Retrieve genes with
GWAS association
with asthma and gene
product is localized to
membrane”
gene geneLabel gene geneLabel gene geneLabel gene geneLabel
Q14912759 SLC22A5 Q18027370 IGSF3 Q18035037 RAD50 Q18027836 IL6R
Q14914243 PSAP Q18033424 IL18R1 Q18033919 XPR1 Q18030409 PDE4D
Q14907990 SLC30A8 Q18045382 HPSE2 Q18042132 GSDMB Q18030185 NOTCH4
Q18035589 C6orf10 Q18027822 IL2RB Q18036729 RAP1GAP2 Q18026947 HLA-DQA1
Q18054256 GSDMA Q18053559 CDHR3 Q18033889 IL1RL1 Q18030785 PRKG1
Q14903974 SMAD3 Q17917202 ERBB4
22 genes
http://bit.ly/bosc2017_wikidata
Leveraging the Disease Ontology structure
“Retrieve genes with GWAS
association with any
respiratory disease and
gene product is localized to
membrane (non-IEA)”
31 genes / 8 diseases
diseaseGALabel gene_counts geneList
asthma 15 SMAD3, RAP1GAP2, IL18R1, HPSE2, SLC30A8, SLC22A5, PSAP, ERBB4, HLA-DQA1, IGSF3, IL2RB, IL6R, NOTCH4, PDE4D, RAD50
chronic obstructive pulmonary disease 5 HLA-C, SFTPD, ANXA5, ANXA11, ATP2C2
lung cancer 3 TGM5, VTI1A, PHACTR2
interstitial lung disease 2 DSP, ATP11A
non-small-cell lung carcinoma 2 NALCN, DLST
nasopharynx carcinoma 2 ITGA9, TNFRSF19
adenocarcinoma of the lung 1 BTNL2
pulmonary emphysema 1 BICD1
http://bit.ly/bosc2017_wikidata
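Widening the question from asthma to "any respiratory disease" relies on walking the ontology's subclass hierarchy, which in SPARQL is typically written as a transitive path such as `wdt:P279*` (subclass of, zero or more steps). The sketch below mimics that traversal offline in Python. The `SUBCLASS_OF` edges are a toy hierarchy made up for illustration, not an extract of the Disease Ontology, and `is_a` and `descendants` are hypothetical helper names.

```python
# Toy subclass-of edges (child -> parent); hypothetical, for illustration only.
SUBCLASS_OF = {
    "asthma": "bronchial disease",
    "bronchial disease": "respiratory system disease",
    "lung cancer": "respiratory system disease",
    "diabetes mellitus": "metabolic disease",
}


def is_a(term, ancestor, edges):
    """True if `term` equals `ancestor` or reaches it by following subclass-of edges."""
    seen = set()
    while term is not None and term not in seen:
        if term == ancestor:
            return True
        seen.add(term)          # guard against accidental cycles
        term = edges.get(term)  # step to the parent, or None at the root
    return False


def descendants(ancestor, edges):
    """All known terms that fall under `ancestor`, including the ancestor itself."""
    terms = set(edges) | set(edges.values())
    return sorted(t for t in terms if is_a(t, ancestor, edges))


respiratory = descendants("respiratory system disease", SUBCLASS_OF)
print(respiratory)
```

Once the set of matching diseases is known, the per-disease gene lookup is the same as in the asthma-only query, which is how a single question can return several diseases at once.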
... and show associated pathways
gene pathway
SMAD3 Androgen receptor signaling pathway
SMAD3 TGF-beta Receptor Signaling
SMAD3 mechlorethamine exposure
HLA-C Allograft Rejection
SFTPD Regulation of toll-like receptor signaling pathway
…. …..
“Retrieve genes with GWAS
association with any
respiratory disease and
gene product is localized to
membrane (non-IEA), show
causative chemical hazards
and show pathways where
they have a role.”
16 genes / 59 pathways
http://bit.ly/bosc2017_wikidata
Try me....
Is Wikidata FAIR?
What makes a service FAIR?
Recognition
Ticking the criteria
(F)indable
(A)ccessible
(I)nteroperable
(R)eusable
Where is the Wikimedia ecosystem located in the Research Data Lifecycle?
● Data creation
● Documentation
● Discovery
● Sharing
● Reuse
How do you interoperate/collaborate/engage with
other services to help support FAIR data?
What is the one key change you think needs to happen for us
to implement FAIR data and FAIR services?
Acknowledgements
● Elvira Mitraka (Disease Ontology, U Baltimore)
● Gang Fu, Evan Bolton (NIH, PubChem)
● Wikimedia Foundation
Thousands of WikiDatans
Andrew Su
Benjamin Good
Sebastian Burgstaller
Lynn Schriml
Núria Queralt Rosinach
Gregory Stupp
Tim Putman
Chunlei Wu
Funding
● National Institute of General Medical Sciences (R01GM089820)
● NIH Common Fund programs for Big Data to Knowledge (U54GM114833)
By Helpameout - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=20337311
Sebastian Burgstaller
Center for Integrative Bioinformatics,
Max Perutz Laboratories,
Vienna Biocenter.
Availability
Gene Wiki: github.com/SuLab/GeneWikiCentral
github.com/SuLab/wikidataintegrator – python module
github.com/SuLab/scheduled-bots – bot automation framework
github.com/SuLab/Genewiki-ShEx – data models
Wikimedia ecosystem for FAIR data
www.wikidata.org - Main entry
www.wikibase.org - Infrastructure stack
github.com/wmde/wikibase-docker - Wikibase docker
query.wikidata.org - Wikidata Query Service
wikidata-shex.wmflabs.org/wiki/Main_Page - Schema
Tools
Chlambase: chlambase.org
Scholia: tools.wmflabs.org/scholia/
OpenRefine: www.wikidata.org/wiki/Wikidata:Tools/OpenRefine
Inventaire: inventaire.io
…… (many more)

Más contenido relacionado

La actualidad más candente

Scott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByteScott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByteGigaScience, BGI Hong Kong
 
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...dgarijo
 
Prototype Crop Wild Relatives Portal, at the IMC Meeting (2007)
Prototype Crop Wild Relatives Portal, at the IMC Meeting (2007)Prototype Crop Wild Relatives Portal, at the IMC Meeting (2007)
Prototype Crop Wild Relatives Portal, at the IMC Meeting (2007)Dag Endresen
 
pro-iBiosphere 2013-05 Linked Open Data (Gregor Hagedorn)
pro-iBiosphere 2013-05 Linked Open Data (Gregor Hagedorn)pro-iBiosphere 2013-05 Linked Open Data (Gregor Hagedorn)
pro-iBiosphere 2013-05 Linked Open Data (Gregor Hagedorn)Gregor Hagedorn
 
Data exchange alternatives, GIGA TAG (2009)
Data exchange alternatives, GIGA TAG (2009)Data exchange alternatives, GIGA TAG (2009)
Data exchange alternatives, GIGA TAG (2009)Dag Endresen
 
Publishing the Full Research Data Lifecycle
Publishing the Full Research Data LifecyclePublishing the Full Research Data Lifecycle
Publishing the Full Research Data LifecycleAnita de Waard
 
DOI registration with DataCite - COOPEUS, ENVRI, EUDAT workshop 2013
DOI registration with DataCite - COOPEUS, ENVRI, EUDAT workshop 2013DOI registration with DataCite - COOPEUS, ENVRI, EUDAT workshop 2013
DOI registration with DataCite - COOPEUS, ENVRI, EUDAT workshop 2013Frauke Ziedorn
 
Bioschemas findability and interoperability
Bioschemas findability and interoperabilityBioschemas findability and interoperability
Bioschemas findability and interoperabilityBioschemas
 
RDA-WDS Publishing Data Interest Group
RDA-WDS Publishing Data Interest GroupRDA-WDS Publishing Data Interest Group
RDA-WDS Publishing Data Interest GroupAnita de Waard
 
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...Data Beers
 
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)Dag Endresen
 
EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)
EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)
EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)Dag Endresen
 
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Juan Antonio Vizcaino
 
Using Neo4j for exploring the research graph connections made by RD-Switchboard
Using Neo4j for exploring the research graph connections made by RD-SwitchboardUsing Neo4j for exploring the research graph connections made by RD-Switchboard
Using Neo4j for exploring the research graph connections made by RD-Switchboardamiraryani
 
An open science introduction. Olinfer 18, La havana, Cuba 12-14 nov 2018
An open science introduction.  Olinfer 18, La havana, Cuba 12-14 nov 2018An open science introduction.  Olinfer 18, La havana, Cuba 12-14 nov 2018
An open science introduction. Olinfer 18, La havana, Cuba 12-14 nov 2018pascal aventurier
 
Proteomics public data resources: enabling "big data" analysis in proteomics
Proteomics public data resources: enabling "big data" analysis in proteomicsProteomics public data resources: enabling "big data" analysis in proteomics
Proteomics public data resources: enabling "big data" analysis in proteomicsJuan Antonio Vizcaino
 
From Open Access to Open data, our initiatives
From Open Access to Open data, our initiativesFrom Open Access to Open data, our initiatives
From Open Access to Open data, our initiativesJohannes Keizer
 
Experiences to learn from the MS proteomics field
Experiences to learn from the MS proteomics fieldExperiences to learn from the MS proteomics field
Experiences to learn from the MS proteomics fieldJuan Antonio Vizcaino
 
Observing Linked Data Dynamics
Observing Linked Data DynamicsObserving Linked Data Dynamics
Observing Linked Data Dynamicskaefer3000
 
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...OpenAIRE
 

La actualidad más candente (20)

Scott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByteScott Edmunds: Preparing a data paper for GigaByte
Scott Edmunds: Preparing a data paper for GigaByte
 
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
 
Prototype Crop Wild Relatives Portal, at the IMC Meeting (2007)
Prototype Crop Wild Relatives Portal, at the IMC Meeting (2007)Prototype Crop Wild Relatives Portal, at the IMC Meeting (2007)
Prototype Crop Wild Relatives Portal, at the IMC Meeting (2007)
 
pro-iBiosphere 2013-05 Linked Open Data (Gregor Hagedorn)
pro-iBiosphere 2013-05 Linked Open Data (Gregor Hagedorn)pro-iBiosphere 2013-05 Linked Open Data (Gregor Hagedorn)
pro-iBiosphere 2013-05 Linked Open Data (Gregor Hagedorn)
 
Data exchange alternatives, GIGA TAG (2009)
Data exchange alternatives, GIGA TAG (2009)Data exchange alternatives, GIGA TAG (2009)
Data exchange alternatives, GIGA TAG (2009)
 
Publishing the Full Research Data Lifecycle
Publishing the Full Research Data LifecyclePublishing the Full Research Data Lifecycle
Publishing the Full Research Data Lifecycle
 
DOI registration with DataCite - COOPEUS, ENVRI, EUDAT workshop 2013
DOI registration with DataCite - COOPEUS, ENVRI, EUDAT workshop 2013DOI registration with DataCite - COOPEUS, ENVRI, EUDAT workshop 2013
DOI registration with DataCite - COOPEUS, ENVRI, EUDAT workshop 2013
 
Bioschemas findability and interoperability
Bioschemas findability and interoperabilityBioschemas findability and interoperability
Bioschemas findability and interoperability
 
RDA-WDS Publishing Data Interest Group
RDA-WDS Publishing Data Interest GroupRDA-WDS Publishing Data Interest Group
RDA-WDS Publishing Data Interest Group
 
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
[Databeers] 06/05/2014 - Boris Villazon: “Data Integration - A Linked Data ap...
 
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
 
EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)
EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)
EURISCO and GBIF IPT, at the Vavilov Institute in St Petersburg (27 April 2010)
 
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
Proteomics and the "big data" trend: challenges and new possibilitites (Talk ...
 
Using Neo4j for exploring the research graph connections made by RD-Switchboard
Using Neo4j for exploring the research graph connections made by RD-SwitchboardUsing Neo4j for exploring the research graph connections made by RD-Switchboard
Using Neo4j for exploring the research graph connections made by RD-Switchboard
 
An open science introduction. Olinfer 18, La havana, Cuba 12-14 nov 2018
An open science introduction.  Olinfer 18, La havana, Cuba 12-14 nov 2018An open science introduction.  Olinfer 18, La havana, Cuba 12-14 nov 2018
An open science introduction. Olinfer 18, La havana, Cuba 12-14 nov 2018
 
Proteomics public data resources: enabling "big data" analysis in proteomics
Proteomics public data resources: enabling "big data" analysis in proteomicsProteomics public data resources: enabling "big data" analysis in proteomics
Proteomics public data resources: enabling "big data" analysis in proteomics
 
From Open Access to Open data, our initiatives
From Open Access to Open data, our initiativesFrom Open Access to Open data, our initiatives
From Open Access to Open data, our initiatives
 
Experiences to learn from the MS proteomics field
Experiences to learn from the MS proteomics fieldExperiences to learn from the MS proteomics field
Experiences to learn from the MS proteomics field
 
Observing Linked Data Dynamics
Observing Linked Data DynamicsObserving Linked Data Dynamics
Observing Linked Data Dynamics
 
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
OpenAIRE in 8 minutes - Introduction to European einfrastructures session at ...
 

Similar a Making Data FAIR on WikiData - Andra Waagmeester

dkNET Annual Meeting - June 2017
dkNET Annual Meeting - June 2017dkNET Annual Meeting - June 2017
dkNET Annual Meeting - June 2017dkNET
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationBenjamin Good
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...Carole Goble
 
Software Pipelines: The Good, The Bad and The Ugly
Software Pipelines: The Good, The Bad and The UglySoftware Pipelines: The Good, The Bad and The Ugly
Software Pipelines: The Good, The Bad and The UglyJoão André Carriço
 
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can EditWikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can EditDario Taraborelli
 
Global Information Systems for Plant Genetic Resources (2009)
Global Information Systems for Plant Genetic Resources (2009)Global Information Systems for Plant Genetic Resources (2009)
Global Information Systems for Plant Genetic Resources (2009)Dag Endresen
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Sanjay Padhi, Ph.D
 
Building a Distributed Collaborative Data Pipeline with Apache Spark
Building a Distributed Collaborative Data Pipeline with Apache SparkBuilding a Distributed Collaborative Data Pipeline with Apache Spark
Building a Distributed Collaborative Data Pipeline with Apache SparkDatabricks
 
User-friendly bioinformatics (Monthly Informational workshop)
User-friendly bioinformatics (Monthly Informational workshop)User-friendly bioinformatics (Monthly Informational workshop)
User-friendly bioinformatics (Monthly Informational workshop)Elia Brodsky
 
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...Dag Endresen
 
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...GigaScience, BGI Hong Kong
 
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...Dr. Haxel Consult
 
Semantic Web Adoption
Semantic Web AdoptionSemantic Web Adoption
Semantic Web Adoptionguest262aaa
 
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...GigaScience, BGI Hong Kong
 
TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...keesvb
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)OpenAIRE
 

Similar a Making Data FAIR on WikiData - Andra Waagmeester (20)

dkNET Annual Meeting - June 2017
dkNET Annual Meeting - June 2017dkNET Annual Meeting - June 2017
dkNET Annual Meeting - June 2017
 
Opportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocurationOpportunities and challenges presented by Wikidata in the context of biocuration
Opportunities and challenges presented by Wikidata in the context of biocuration
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
 
Software Pipelines: The Good, The Bad and The Ugly
Software Pipelines: The Good, The Bad and The UglySoftware Pipelines: The Good, The Bad and The Ugly
Software Pipelines: The Good, The Bad and The Ugly
 
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can EditWikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
Wikidata: Verifiable, Linked Open Knowledge That Anyone Can Edit
 
Global Information Systems for Plant Genetic Resources (2009)
Global Information Systems for Plant Genetic Resources (2009)Global Information Systems for Plant Genetic Resources (2009)
Global Information Systems for Plant Genetic Resources (2009)
 
Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021 Tag.bio aws public jun 08 2021
Tag.bio aws public jun 08 2021
 
Building a Distributed Collaborative Data Pipeline with Apache Spark
Building a Distributed Collaborative Data Pipeline with Apache SparkBuilding a Distributed Collaborative Data Pipeline with Apache Spark
Building a Distributed Collaborative Data Pipeline with Apache Spark
 
FAIRsharing & FAIRcookbook at RDA 2023
FAIRsharing & FAIRcookbook at RDA 2023FAIRsharing & FAIRcookbook at RDA 2023
FAIRsharing & FAIRcookbook at RDA 2023
 
User-friendly bioinformatics (Monthly Informational workshop)
User-friendly bioinformatics (Monthly Informational workshop)User-friendly bioinformatics (Monthly Informational workshop)
User-friendly bioinformatics (Monthly Informational workshop)
 
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
 
FAIRsharing and the FAIR Cookbook
FAIRsharing and the FAIR Cookbook FAIRsharing and the FAIR Cookbook
FAIRsharing and the FAIR Cookbook
 
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
 
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...
AI-SDV 2020: Combining Knowledge and Machine Learning for the Analysis of Sci...
 
Semantic Web Adoption
Semantic Web AdoptionSemantic Web Adoption
Semantic Web Adoption
 
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
STM Week: Demonstrating bringing publications to life via an End-to-end XML p...
 
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
 
TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 1)
 

Más de OpenAIRE

10th OpenAIRE Content Providers Community Call
10th OpenAIRE Content Providers Community Call10th OpenAIRE Content Providers Community Call
10th OpenAIRE Content Providers Community CallOpenAIRE
 
9th Content Providers Community Call\
9th Content Providers Community Call\9th Content Providers Community Call\
9th Content Providers Community Call\OpenAIRE
 
OpenAIRE in the European Open Science Cloud (EOSC)
OpenAIRE in the European Open Science Cloud (EOSC)OpenAIRE in the European Open Science Cloud (EOSC)
OpenAIRE in the European Open Science Cloud (EOSC)OpenAIRE
 
8th Content Providers Community Call
8th Content Providers Community Call8th Content Providers Community Call
8th Content Providers Community CallOpenAIRE
 
7th Content Providers Community Call
7th Content Providers Community Call7th Content Providers Community Call
7th Content Providers Community CallOpenAIRE
 
OpenAIRE PROVIDE Dashboard for Turkish repository managers
OpenAIRE PROVIDE Dashboard for Turkish repository managersOpenAIRE PROVIDE Dashboard for Turkish repository managers
OpenAIRE PROVIDE Dashboard for Turkish repository managersOpenAIRE
 
What will it cost to manage and share my data?
What will it cost to manage and share my data?What will it cost to manage and share my data?
What will it cost to manage and share my data?OpenAIRE
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 3)OpenAIRE
 
Open Research Gateway for the ELIXIR-GR Infrastructure (Part 2)
Making Data FAIR on WikiData - Andra Waagmeester

  • 2. Structure
      ● Introduction
          ○ Wikidata
          ○ Wikibase
          ○ Gene Wiki
      ● Bits and Bolts
          ○ How to get your data into Wikidata/Wikibase
          ○ Using Wikidata/Wikibase
      ● Is Wikidata FAIR?
          ○ Where is it located in the research life cycle
          ○ How FAIR is Wikidata/Wikibase/Genewiki
          ○ interoperate/collaborate/engage
          ○ one key change to successfully implement FAIR data infrastructure
  • 4. Wikidata is to data as Wikipedia is to text
      Wikidata is a collaboratively edited knowledge base operated by the Wikimedia Foundation
      ● Completely free, even for commercial usage (CC0)
      ● Anybody can contribute
      ● Covers all domains of knowledge
      ● Extensive item history, talk pages, projects, users
      ● Integration with the semantic web
      ● High performance query engine (SPARQL)
      ● Stable! Long term support not dictated by funding cycles
      ● Actively developed
      ● Already has large number of active users, editors, contributors!
      A giant graph of knowledge!
  • 5. FAIR on Wikidata: The Wikimedia infrastructure
      Columns: Infrastructure, Resource, Content, Where?
      Content: Data, Text, Schema
      Under development
      Where?
        https://www.wikidata.org
        https://www.wikibase.org
        https://<lang>.wikipedia.org
        https://releases.wikimedia.org/mediawiki/
        https://shex.io
        https://wikidata-shex.wmflabs.org/
  • 6. FAIR on Wikidata: Providing identifiers http://ceur-ws.org/Vol-2275/short1.pdf
  • 7. FAIR on Wikidata: Providing identifiers
  • 14. The Gene Wiki project, circa 2008 (Huss, PLoS Biol, 2008)
      ● Data imported from structured databases
      ● Summarized knowledge via crowdsourcing
  • 19. Community engagement and model discussion
  • 20. Model development: formally capture and describe model and community consensus
      ● Legacy review – develop punch lists for existing data issues that need fixing
      ● Documentation – terse, human-readable representation helping contributors and maintainers quickly grok the model
      ● Client pre-submission – submitters test their data before submission to make sure they’re saying what they want to say and that the receiving schema can accommodate all of their data
      ● Server pre-ingestion – submission process checks data as it comes in and either rejects or warns about non-conformant data
  • 21. Seeding with data
      ● Model structure of items (genes, drugs, diseases, etc.) & relationships between items
      ● Import data from many sources and ontologies
      ● Linked to many identifiers from external databases
      ● Architecture for maintaining data from external sources
  • 22. Platforms for (semi-) automatic data ingestion
      ● QuickStatements: https://tools.wmflabs.org/quickstatements/
      ● OpenRefine: https://www.wikidata.org/wiki/Wikidata:Tools/OpenRefine
      ● Wikidata Integrator: https://github.com/SuLab/WikidataIntegrator
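As an illustration of semi-automatic ingestion, the sketch below renders (item, property, value) triples as tab-separated QuickStatements V1 commands that could be pasted into the tool. It is a minimal sketch, not part of the original slides: the function names are hypothetical, the example triples are illustrative only, and Q4115189 is assumed to be the Wikidata sandbox item. Strings containing quotes or tabs are rejected rather than escaped, because the escaping rules are not covered here.

```python
import re

# Matches Wikidata item identifiers such as Q7187.
ITEM_ID = re.compile(r"Q\d+")


def format_value(value):
    """Item IDs are written bare; any other value is emitted as a quoted string."""
    if ITEM_ID.fullmatch(value):
        return value
    if '"' in value or "\t" in value:
        raise ValueError("quotes and tabs are not handled in this sketch")
    return f'"{value}"'


def quickstatements_v1(triples):
    """Render (subject, property, value) triples as QuickStatements V1 lines."""
    lines = []
    for subject, prop, value in triples:
        lines.append("\t".join([subject, prop, format_value(value)]))
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative statements on the (assumed) sandbox item.
    batch = [
        ("Q4115189", "P31", "Q7187"),   # instance of: gene
        ("Q4115189", "P351", "1017"),   # Entrez Gene ID as a string value
    ]
    print(quickstatements_v1(batch))
```

Generating the batch as text keeps a human in the loop: the commands can be reviewed before anything is written to Wikidata or a Wikibase instance.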
  • 24. Simple data retrieval
      “Retrieve genes with GWAS association with asthma”
      39 genes

      gene       geneLabel   gene       geneLabel   gene       geneLabel   gene       geneLabel
      Q5013317   COL22A1     Q18027370  IGSF3       Q18053559  CDHR3       Q14903974  SMAD3
      Q14912759  SLC22A5     Q18045382  HPSE2       Q18045669  ATG3        Q18033889  IL1RL1
      Q14914243  PSAP        Q18048437  IL33        Q18035037  RAD50       Q17917202  ERBB4
      Q14907990  SLC30A8     Q18051900  PYHIN1      Q18036984  FBXL7       Q18027836  IL6R
      Q18025002  GAB1        Q17709208  ACO1        Q18033919  XPR1        Q18030185  NOTCH4
      Q18035589  C6orf10     Q18027822  IL2RB       Q15326496  RORA        Q18030409  PDE4D
      Q18054256  GSDMA       Q18030364  PBX2        Q18042132  GSDMB       Q18045645  IKZF4
      Q18058487  C5orf56     Q18037773  ABI3BP      Q18029145  MKLN1       Q18039979  KLHL5
      Q18030785  PRKG1       Q18039623  CTNNA3      Q18036729  RAP1GAP2    Q18026947  HLA-DQA1
      Q18033424  IL18R1      Q18046350  ZNF665      Q14878303  IL13

      http://bit.ly/bosc2017_wikidata
  • 25. Data integration
      “Retrieve genes with GWAS association with asthma and gene product is localized to membrane”
      22 genes

      gene       geneLabel   gene       geneLabel   gene       geneLabel   gene       geneLabel
      Q14912759  SLC22A5     Q18027370  IGSF3       Q18035037  RAD50       Q18027836  IL6R
      Q14914243  PSAP        Q18033424  IL18R1      Q18033919  XPR1        Q18030409  PDE4D
      Q14907990  SLC30A8     Q18045382  HPSE2       Q18042132  GSDMB       Q18030185  NOTCH4
      Q18035589  C6orf10     Q18027822  IL2RB       Q18036729  RAP1GAP2    Q18026947  HLA-DQA1
      Q18054256  GSDMA       Q18053559  CDHR3       Q18033889  IL1RL1      Q18030785  PRKG1
      Q14903974  SMAD3       Q17917202  ERBB4

      http://bit.ly/bosc2017_wikidata
  • 26. Leveraging the Disease Ontology structure
      “Retrieve genes with GWAS association with any respiratory disease and gene product is localized to membrane (non-IEA)”
      31 genes / 8 diseases

      diseaseGALabel                         gene_counts  geneList
      asthma                                 15           SMAD3, RAP1GAP2, IL18R1, HPSE2, SLC30A8, SLC22A5, PSAP, ERBB4, HLA-DQA1, IGSF3, IL2RB, IL6R, NOTCH4, PDE4D, RAD50
      chronic obstructive pulmonary disease  5            HLA-C, SFTPD, ANXA5, ANXA11, ATP2C2
      lung cancer                            3            TGM5, VTI1A, PHACTR2
      interstitial lung disease              2            DSP, ATP11A
      non-small-cell lung carcinoma          2            NALCN, DLST
      nasopharynx carcinoma                  2            ITGA9, TNFRSF19
      adenocarcinoma of the lung             1            BTNL2
      pulmonary emphysema                    1            BICD1

      http://bit.ly/bosc2017_wikidata
  • 27. ... and show associated pathways
      “Retrieve genes with GWAS association with any respiratory disease and gene product is localized to membrane (non-IEA), show causative chemical hazards and show pathways where they have a role.”
      16 genes / 59 pathways

      gene    pathway
      SMAD3   Androgen receptor signaling pathway
      SMAD3   TGF-beta Receptor Signaling
      SMAD3   mechlorethamine exposure
      HLA-C   Allograft Rejection
      SFTPD   Regulation of toll-like receptor signaling pathway
      ….      …..

      http://bit.ly/bosc2017_wikidata
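A retrieval like the one on this slide can be approximated against the Wikidata Query Service. The sketch below is not the query behind the slide (that one is linked from the bit.ly address); it assumes that P2293 is the "genetic association" property and Q35869 is the asthma item, and it matches the property in either direction because the direction used on gene and disease items is not stated here. It does not filter on the GWAS determination method, so results may differ from the 39 genes shown. Function names and the User-Agent string are hypothetical.

```python
import json
import re
import urllib.parse
import urllib.request

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def gwas_gene_query(disease_qid="Q35869", association_pid="P2293"):
    """Build a SPARQL query for genes linked to a disease item.

    Default identifiers are assumptions (asthma, genetic association);
    verify them on wikidata.org before relying on the results.
    """
    if not re.fullmatch(r"Q\d+", disease_qid):
        raise ValueError("disease_qid must look like Q35869")
    if not re.fullmatch(r"P\d+", association_pid):
        raise ValueError("association_pid must look like P2293")
    return (
        "SELECT DISTINCT ?gene ?geneLabel WHERE {\n"
        # Property path: match the association in either direction.
        f"  ?gene (wdt:{association_pid}|^wdt:{association_pid}) wd:{disease_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def parse_gene_rows(result):
    """Turn a SPARQL JSON result into (QID, label) tuples."""
    rows = []
    for binding in result["results"]["bindings"]:
        qid = binding["gene"]["value"].rsplit("/", 1)[-1]
        label = binding.get("geneLabel", {}).get("value", qid)
        rows.append((qid, label))
    return rows


def run_query(query, user_agent="fair-wikidata-example/0.1 (hypothetical contact)"):
    """Send the query to the public endpoint; requires network access."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `parse_gene_rows(run_query(gwas_gene_query()))` would return rows in the same gene / geneLabel shape as the table on the slide.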
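The step from one disease to a whole disease family relies on walking the class hierarchy. A minimal sketch of that idea, assuming P279 ("subclass of") carries the Disease Ontology structure in Wikidata and P2293 is the genetic-association property: the `P279*` property path matches the root class and every descendant, and the aggregation mirrors the diseaseGALabel / gene_counts / geneList columns of the slide. The function name is hypothetical, the QID of the respiratory-disease class is not given in the slides and must be supplied by the caller, and the membrane and non-IEA filters from the slide are omitted.

```python
import re


def disease_family_query(root_disease_qid, association_pid="P2293", subclass_pid="P279"):
    """Build a SPARQL query summarising associated genes per disease
    for a root disease class and all of its subclasses."""
    if not re.fullmatch(r"Q\d+", root_disease_qid):
        raise ValueError("root_disease_qid must look like Q12345")
    for pid in (association_pid, subclass_pid):
        if not re.fullmatch(r"P\d+", pid):
            raise ValueError("property IDs must look like P279")
    return f"""SELECT ?diseaseLabel
       (COUNT(DISTINCT ?gene) AS ?gene_counts)
       (GROUP_CONCAT(DISTINCT ?geneLabel; separator=", ") AS ?geneList)
WHERE {{
  # Zero or more subclass-of hops: the root class and all descendants.
  ?disease wdt:{subclass_pid}* wd:{root_disease_qid} .
  # Association in either direction (direction is an assumption).
  ?gene (wdt:{association_pid}|^wdt:{association_pid}) ?disease .
  ?disease rdfs:label ?diseaseLabel . FILTER(LANG(?diseaseLabel) = "en")
  ?gene rdfs:label ?geneLabel . FILTER(LANG(?geneLabel) = "en")
}}
GROUP BY ?diseaseLabel
ORDER BY DESC(?gene_counts)"""


if __name__ == "__main__":
    # Q35869 (assumed to be asthma) is used only to show the output shape.
    print(disease_family_query("Q35869"))
```

Labels are fetched with plain `rdfs:label` rather than the label service so that they can be aggregated with `GROUP_CONCAT` in the same query.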
  • 33. What makes a service FAIR?
  • 36. Where is the Wikimedia ecosystem located in the Research Data Lifecycle?
      ● Data creation
      ● Documentation
      ● Discovery
      ● Sharing
      ● Reuse
  • 37. How do you interoperate/collaborate/engage with other services to help support FAIR data?
  • 39. What is the one key change you think needs to happen for us to implement FAIR data and FAIR services?
  • 40. Acknowledgements
      ● Elvira Mitraka (Disease Ontology, U Baltimore)
      ● Gang Fu, Evan Bolton (NIH, PubChem)
      ● Wikimedia Foundation
      ● Thousands of WikiDatans
      ● Andrew Su
      ● Benjamin Good
      ● Sebastian Burgstaller (Center for Integrative Bioinformatics, Max Perutz Laboratories, Vienna Biocenter)
      ● Lynn Schriml
      ● Núria Queralt Rosinach
      ● Gregory Stupp
      ● Tim Putman
      ● Chunlei Wu
      Funding
      ● National Institute of General Medical Sciences (R01GM089820)
      ● NIH Common Fund programs for Big Data to Knowledge (U54GM114833)
      Image credit: By Helpameout - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=20337311
  • 42. Availability
      Gene Wiki:
        github.com/SuLab/GeneWikiCentral
        github.com/SuLab/wikidataintegrator - python module
        github.com/SuLab/scheduled-bots - bot automation framework
        github.com/SuLab/Genewiki-ShEx - data models
      Wikimedia ecosystem for FAIR data:
        www.wikidata.org - Main entry
        www.wikibase.org - Infrastructure stack
        github.com/wmde/wikibase-docker - Wikibase docker
        query.wikidata.org - Wikidata Query Service
        wikidata-shex.wmflabs.org/wiki/Main_Page - Schema
      Tools:
        Chlambase: chlambase.org
        Scholia: tools.wmflabs.org/scholia/
        Open Refine: www.wikidata.org/wiki/Wikidata:Tools/OpenRefine
        Inventaire: inventaire.io
        …… (many more)