Hazard Estimation and Method Comparison with OWL-Encoded Toxicity Decision Trees. Leonid L. Chepelev, Dana Klassen, and Michel Dumontier. Department of Biology, Institute of Biochemistry, School of Computer Science, Carleton University, Ottawa, Canada. An OWLED 2011 Paper.
Motivation: Machine learning approaches such as decision trees are commonly used in toxicity prediction. However, complex trees can be difficult to interpret, and there is no explanation for the category obtained. Moreover, many variant decision trees are being published, and they are difficult to compare. Can we use OWL ontologies to formally represent and compare decision trees? A simple toxicity decision tree: at each branching point, a rule is evaluated, and based on the outcome of this rule, either a final activity decision is made, or judgment is deferred to another node. 2
Druglikeness: Lipinski’s Rule of Five. Rule of thumb for druglikeness (orally active in humans), with 4 rules based on multiples of 5: a mass of 500 daltons or less; 5 hydrogen bond donors or fewer; 10 hydrogen bond acceptors or fewer; a partition coefficient (logP) value between -5 and 5. All of these conditions must be satisfied for a molecule to be considered druglike. A molecule failing any of them would not be druglike. 3
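The four conditions on this slide can be written as a small check. This is a minimal sketch, not the authors' code: the `Molecule` class, its field names, and the two example molecules with their descriptor values are hypothetical, chosen only to illustrate the rule as the slide states it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Molecule:
    """Hypothetical container for the four descriptors named on the slide."""
    name: str
    mass: float        # molecular mass in daltons
    h_donors: int      # hydrogen bond donor count
    h_acceptors: int   # hydrogen bond acceptor count
    logp: float        # partition coefficient


def lipinski_failures(m: Molecule) -> List[str]:
    """Return the list of rule-of-five conditions that the molecule fails."""
    failures = []
    if m.mass > 500:
        failures.append("mass > 500 Da")
    if m.h_donors > 5:
        failures.append("more than 5 hydrogen bond donors")
    if m.h_acceptors > 10:
        failures.append("more than 10 hydrogen bond acceptors")
    if not (-5 <= m.logp <= 5):
        failures.append("logP outside [-5, 5]")
    return failures


def is_druglike(m: Molecule) -> bool:
    """A molecule is druglike only if it fails none of the conditions."""
    return not lipinski_failures(m)


# Illustrative values only; these are not taken from the HMDB dataset.
small = Molecule("example-small", mass=151.2, h_donors=2, h_acceptors=2, logp=0.5)
heavy = Molecule("example-heavy", mass=800.0, h_donors=6, h_acceptors=12, logp=6.5)

for mol in (small, heavy):
    print(mol.name, is_druglike(mol), lipinski_failures(mol))
```

Returning the list of failed conditions, rather than a bare boolean, keeps the reason for a negative result visible, which is the kind of explanation the talk argues for.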
Chemical Data: The Lipinski drug-likeness dataset comprises 7000 compounds from the Human Metabolome Database (HMDB). Attributes were computed using the Chemistry Development Kit. The tree was built with open source Weka, a collection of machine learning algorithms for data mining with tools for data pre-processing, classification, regression, clustering, association rules, and visualization. 4
Rule of Five Decision Tree: Correctly classified molecule counts are given in brackets. 100% accuracy in ten-fold cross-validation. 5
Formalization: Substance II and Substance III are each a subClassOf Substance I. A substance I is something that has a molecular weight. Substance II is a kind of substance I that has a molecular weight <= 500. Substance III is a kind of substance I that has a molecular weight > 500. 6
Formalization: Every node in the decision tree represents an entity having an attribute or feature, whose value may be specified. In the diagram, Substance II is a subClassOf Substance I; both have the attribute Molecular Weight, and for Substance II that attribute has a value. The branches split at 499.296759 (<= 499.296759 versus > 499.296759).
Substance I is something that has a molecular weight:
‘substance I’ equivalentClass ‘has attribute’ some ‘molecular weight’
Substance II is a kind of substance I with a specified molecular weight value:
‘substance II’ equivalentClass ‘substance I’ and ‘has attribute’ some (‘molecular weight’ and ‘has value’ double[<= 499.296759])
7
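As a rough illustration of how one numeric split yields two class definitions, the sketch below builds the axioms as plain strings. It assumes nothing about the authors' generator: `value_restriction` and `split_node` are hypothetical helpers, the class name 'substance III' for the second branch is borrowed from the previous slide, and the output only mirrors the notation used on the slide rather than validated OWL syntax.

```python
from typing import Tuple


def value_restriction(attribute: str, datatype: str, facet: str) -> str:
    """Build the 'has attribute' restriction in the slide's notation."""
    return f"'has attribute' some ('{attribute}' and 'has value' {datatype}[{facet}])"


def split_node(parent: str, attribute: str, datatype: str, threshold: float,
               low_name: str, high_name: str) -> Tuple[str, str]:
    """Return equivalentClass axioms (as strings) for the <= and > branches."""
    low_facet = f"<= {threshold}"
    high_facet = f"> {threshold}"
    low = (f"'{low_name}' equivalentClass '{parent}' and "
           + value_restriction(attribute, datatype, low_facet))
    high = (f"'{high_name}' equivalentClass '{parent}' and "
            + value_restriction(attribute, datatype, high_facet))
    return low, high


low_axiom, high_axiom = split_node(
    parent="substance I",
    attribute="molecular weight",
    datatype="double",
    threshold=499.296759,
    low_name="substance II",
    high_name="substance III",
)
print(low_axiom)
print(high_axiom)
```

Because each child is defined as the parent plus one extra restriction, the child is subsumed by the parent by construction, which is what lets the tree be read as a class hierarchy.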
The Chemical Information Ontology (CHEMINF): 100+ chemical descriptors and 50+ chemical qualities. Relates descriptors to their specifications and to the software that generated them (along with the running parameters and the algorithms that they implement). Contributors: Nico Adams, Leonid Chepelev, Michel Dumontier, Janna Hastings, Egon Willighagen, Peter Murray-Rust, Christoph Steinbeck. http://semanticchemistry.googlecode.com 8
A simple decision tree can be represented as a set of subsuming OWL classes. Methods: A Weka tree was trained and serialized into dot format. The Weka API was used to read the document, and the ontology was created using the OWL API. 9
Each outcome may also be formalized in terms of the set of all attributes, as obtained by drawing a path to the root:
Druglike-molecule equivalentClass ‘molecule’
and ‘has attribute’ some (‘molecular weight’ that ‘has value’ double[<= 500.0])
and ‘has attribute’ some (‘hydrogen bond donor count’ that ‘has value’ int[<= 5])
and ‘has attribute’ some (‘hydrogen bond acceptor count’ that ‘has value’ int[<= 10])
and ‘has attribute’ some (‘partition coefficient’ that ‘has value’ double[<= 5.0, >= -5.0])
10
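The idea of defining each outcome by the conditions collected along its path to the root can be sketched as a tree walk. The `Leaf` and `Split` classes and the two-level toy tree below are assumptions made for illustration; they are not the tree learned by Weka in the talk, and the conditions are emitted as readable text rather than OWL.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Split:
    attribute: str
    threshold: float
    low: Union["Split", Leaf]    # branch taken when value <= threshold
    high: Union["Split", Leaf]   # branch taken when value > threshold


def leaf_definitions(node: Union[Split, Leaf],
                     path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, List[str]]]:
    """Yield (label, conditions) per leaf, with conditions ordered root to leaf."""
    if isinstance(node, Leaf):
        yield node.label, list(path)
        return
    low_condition = f"{node.attribute} <= {node.threshold}"
    high_condition = f"{node.attribute} > {node.threshold}"
    yield from leaf_definitions(node.low, path + (low_condition,))
    yield from leaf_definitions(node.high, path + (high_condition,))


# Toy tree using only two of the four rule-of-five attributes.
toy_tree = Split(
    "molecular weight", 500.0,
    low=Split("hydrogen bond donor count", 5.0,
              low=Leaf("druglike"), high=Leaf("not druglike")),
    high=Leaf("not druglike"),
)

definitions = list(leaf_definitions(toy_tree))
for label, conditions in definitions:
    print(label, "=", " and ".join(conditions))
```

Each printed line corresponds to one conjunction of the kind shown on the slide, with one conjunct per branching point on the way down.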
Large scale decision trees: The Lipinski example is relatively trivial. Can we create a new decision tree capable of classifying linked data? Obtained 1400 chemicals from an EPA ToxCast carcinogenic toxicity dataset, labelled either toxic or non-toxic. Computed 318 boolean features using the ToxTree API (http://toxtree.sourceforge.net/). Generated the decision tree using Weka. Generated the OWL ontology using the OWL API. Generated individuals using the CHESS specification and used descriptors specified in the CHEMINF ontology. Classification using OWL API + Pellet; Protégé 4 and HermiT. 11
A decision tree to predict carcinogenic toxicity 12
Decision Tree to OWL Ontology 13
Is acetaminophen toxic? 14
From data to automated reasoning: data, then linked data, then automated reasoning (realization) over the OWL-encoded toxicity tree. 15
Path through Decision Tree kindly provided by reasoning about the OWL ontology 18
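In the talk the path is obtained from the reasoner. The sketch below is only a procedural stand-in that shows what such a path looks like for a tree over boolean features: it walks a nested dictionary and records each test and its outcome. The tree layout, the feature names `alert_A` and `alert_B`, and the example input are hypothetical and do not come from the ToxCast tree.

```python
from typing import Dict, List, Tuple


def classify_with_path(tree: dict, features: Dict[str, bool]) -> Tuple[str, List[Tuple[str, bool]]]:
    """Walk the tree and return (label, path), where path lists (feature, value) per node visited."""
    path: List[Tuple[str, bool]] = []
    node = tree
    while "label" not in node:           # internal nodes carry a feature test
        name = node["feature"]
        value = bool(features[name])
        path.append((name, value))
        node = node["if_true"] if value else node["if_false"]
    return node["label"], path


# Hypothetical two-level tree over boolean features.
toy_tree = {
    "feature": "alert_A",
    "if_true": {"label": "toxic"},
    "if_false": {
        "feature": "alert_B",
        "if_true": {"label": "toxic"},
        "if_false": {"label": "non-toxic"},
    },
}

label, path = classify_with_path(toy_tree, {"alert_A": False, "alert_B": True})
print(label)
for feature, value in path:
    print(f"  {feature} = {value}")
```

The recorded path plays the role of the explanation: it lists exactly which features were consulted to reach the decision.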
Comparison of toxicity trees: Along with the standard Lipinski rule of five ontology, we generated a variant where MW <= 250. Reasoning over the two ontologies, we see that the active compound (based on MW <= 250) is subsumed by the active compound based on MW <= 500. 19
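The subsumption result can be reproduced in miniature with an interval check. This is a simplified stand-in for a description logic reasoner, assuming each class is a conjunction of closed numeric ranges on named attributes; the dictionary encoding and the `is_subsumed_by` helper are hypothetical and are not how Pellet or HermiT work internally.

```python
import math
from typing import Dict, Tuple

Interval = Tuple[float, float]   # closed interval (lower, upper)


def is_subsumed_by(a: Dict[str, Interval], b: Dict[str, Interval]) -> bool:
    """True if every constraint in b is matched by an equal or narrower constraint in a."""
    for attribute, (b_low, b_high) in b.items():
        if attribute not in a:
            return False             # a says nothing about an attribute that b restricts
        a_low, a_high = a[attribute]
        if a_low < b_low or a_high > b_high:
            return False             # a admits values that b excludes
    return True


# Standard rule-of-five class, encoded as ranges taken from the slides.
standard = {
    "molecular weight": (-math.inf, 500.0),
    "hydrogen bond donor count": (-math.inf, 5.0),
    "hydrogen bond acceptor count": (-math.inf, 10.0),
    "partition coefficient": (-5.0, 5.0),
}
# Variant with the stricter molecular weight bound.
variant = {**standard, "molecular weight": (-math.inf, 250.0)}

print(is_subsumed_by(variant, standard))   # the MW <= 250 class sits under the MW <= 500 class
print(is_subsumed_by(standard, variant))   # the reverse does not hold
```

The check is directional on purpose: a stricter bound makes a class more specific, so the variant falls under the standard class but not the other way around.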
Conclusion: Decision trees can be faithfully represented as OWL ontologies. Once a tree is formalized as an ontology, we can automatically reason about it and use it to classify new chemicals (hence predict toxicity). If we maintain the structure of the decision tree, we can get explanations that provide the set of attributes used in the decision making (unlike black-box counterparts). We expect that trees generated with different but aligned vocabularies may now be comparable. 20
Acknowledgements. CHEMINF Group: Leo Chepelev, Janna Hastings, Egon Willighagen, Nico Adams. Toxicity Group: Leo Chepelev, Dana Klassen. 21
   dumontierlab.com michel_dumontier@carleton.ca Presentations: http://slideshare.com/micheldumontier 22

Hemodialysis: Chapter 1, Physiological Principles of Hemodialysis - Dr.GawadHemodialysis: Chapter 1, Physiological Principles of Hemodialysis - Dr.Gawad
Hemodialysis: Chapter 1, Physiological Principles of Hemodialysis - Dr.Gawad
 
Our Hottest 💘 Surat ℂall Girls Serviℂe 💘Pasodara📱 8527049040📱450+ ℂall Girl C...
Our Hottest 💘 Surat ℂall Girls Serviℂe 💘Pasodara📱 8527049040📱450+ ℂall Girl C...Our Hottest 💘 Surat ℂall Girls Serviℂe 💘Pasodara📱 8527049040📱450+ ℂall Girl C...
Our Hottest 💘 Surat ℂall Girls Serviℂe 💘Pasodara📱 8527049040📱450+ ℂall Girl C...
 
Cervical screening – taking care of your health flipchart (Vietnamese)
Cervical screening – taking care of your health flipchart (Vietnamese)Cervical screening – taking care of your health flipchart (Vietnamese)
Cervical screening – taking care of your health flipchart (Vietnamese)
 
SEMESTER-V CHILD HEALTH NURSING-UNIT-1-INTRODUCTION.pdf
SEMESTER-V CHILD HEALTH NURSING-UNIT-1-INTRODUCTION.pdfSEMESTER-V CHILD HEALTH NURSING-UNIT-1-INTRODUCTION.pdf
SEMESTER-V CHILD HEALTH NURSING-UNIT-1-INTRODUCTION.pdf
 
HIFI* ℂall Girls In Thane West Phone 🔝 9920874524 🔝 💃 Me All Time Serviℂe Ava...
HIFI* ℂall Girls In Thane West Phone 🔝 9920874524 🔝 💃 Me All Time Serviℂe Ava...HIFI* ℂall Girls In Thane West Phone 🔝 9920874524 🔝 💃 Me All Time Serviℂe Ava...
HIFI* ℂall Girls In Thane West Phone 🔝 9920874524 🔝 💃 Me All Time Serviℂe Ava...
 
Cas 28578-16-7 PMK ethyl glycidate ( new PMK powder) best suppler
Cas 28578-16-7 PMK ethyl glycidate ( new PMK powder) best supplerCas 28578-16-7 PMK ethyl glycidate ( new PMK powder) best suppler
Cas 28578-16-7 PMK ethyl glycidate ( new PMK powder) best suppler
 
CAS 110-63-4 BDO Liquid 1,4-Butanediol 1 4 BDO Warehouse Supply For Excellent...
CAS 110-63-4 BDO Liquid 1,4-Butanediol 1 4 BDO Warehouse Supply For Excellent...CAS 110-63-4 BDO Liquid 1,4-Butanediol 1 4 BDO Warehouse Supply For Excellent...
CAS 110-63-4 BDO Liquid 1,4-Butanediol 1 4 BDO Warehouse Supply For Excellent...
 
TEST BANK for The Nursing Assistant Acute, Subacute, and Long-Term Care, 6th ...
TEST BANK for The Nursing Assistant Acute, Subacute, and Long-Term Care, 6th ...TEST BANK for The Nursing Assistant Acute, Subacute, and Long-Term Care, 6th ...
TEST BANK for The Nursing Assistant Acute, Subacute, and Long-Term Care, 6th ...
 
World Hypertension Day 17th may 2024 ppt
World Hypertension Day 17th may 2024 pptWorld Hypertension Day 17th may 2024 ppt
World Hypertension Day 17th may 2024 ppt
 
Denture base resins materials and its mechanism of action
Denture base resins materials and its mechanism of actionDenture base resins materials and its mechanism of action
Denture base resins materials and its mechanism of action
 
ANAPHYLAXIS BY DR.SOHAN BISWAS,MBBS,DNB(INTERNAL MEDICINE) RESIDENT.pptx
ANAPHYLAXIS BY DR.SOHAN BISWAS,MBBS,DNB(INTERNAL MEDICINE) RESIDENT.pptxANAPHYLAXIS BY DR.SOHAN BISWAS,MBBS,DNB(INTERNAL MEDICINE) RESIDENT.pptx
ANAPHYLAXIS BY DR.SOHAN BISWAS,MBBS,DNB(INTERNAL MEDICINE) RESIDENT.pptx
 
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now
5CL-ADB powder supplier 5cl adb 5cladba 5cl raw materials vendor on sale now
 
TEST BANK For Lewis's Medical Surgical Nursing in Canada, 4th Edition by Jane...
TEST BANK For Lewis's Medical Surgical Nursing in Canada, 4th Edition by Jane...TEST BANK For Lewis's Medical Surgical Nursing in Canada, 4th Edition by Jane...
TEST BANK For Lewis's Medical Surgical Nursing in Canada, 4th Edition by Jane...
 
DR. Neha Mehta Best Psychologist.in India
DR. Neha Mehta Best Psychologist.in IndiaDR. Neha Mehta Best Psychologist.in India
DR. Neha Mehta Best Psychologist.in India
 
5Cladba ADBB 5cladba buy 6cl adbb powder 5cl ADBB precursor materials
5Cladba ADBB 5cladba buy 6cl adbb powder 5cl ADBB precursor materials5Cladba ADBB 5cladba buy 6cl adbb powder 5cl ADBB precursor materials
5Cladba ADBB 5cladba buy 6cl adbb powder 5cl ADBB precursor materials
 
Unveiling Alcohol Withdrawal Syndrome: exploring it's hidden depths
Unveiling Alcohol Withdrawal Syndrome: exploring it's hidden depthsUnveiling Alcohol Withdrawal Syndrome: exploring it's hidden depths
Unveiling Alcohol Withdrawal Syndrome: exploring it's hidden depths
 
Tips and tricks to pass the cardiovascular station for PACES exam
Tips and tricks to pass the cardiovascular station for PACES examTips and tricks to pass the cardiovascular station for PACES exam
Tips and tricks to pass the cardiovascular station for PACES exam
 
VVIP Yelahanka ℂall Girls 6350482085 Heat-immolating { Bangalore } Coveted Gi...
VVIP Yelahanka ℂall Girls 6350482085 Heat-immolating { Bangalore } Coveted Gi...VVIP Yelahanka ℂall Girls 6350482085 Heat-immolating { Bangalore } Coveted Gi...
VVIP Yelahanka ℂall Girls 6350482085 Heat-immolating { Bangalore } Coveted Gi...
 
Cardiovascular Physiology - Regulation of Cardiac Pumping
Cardiovascular Physiology - Regulation of Cardiac PumpingCardiovascular Physiology - Regulation of Cardiac Pumping
Cardiovascular Physiology - Regulation of Cardiac Pumping
 
Is Rheumatoid Arthritis a Metabolic Disorder.pptx
Is Rheumatoid Arthritis a Metabolic Disorder.pptxIs Rheumatoid Arthritis a Metabolic Disorder.pptx
Is Rheumatoid Arthritis a Metabolic Disorder.pptx
 

Hazard Estimation and Method Comparison with OWL-Encoded Toxicity Decision Trees

  • 1. Hazard Estimation and Method Comparison with OWL-Encoded Toxicity Decision Trees. Leonid L. Chepelev, Dana Klassen, and Michel Dumontier. Department of Biology, Institute of Biochemistry, School of Computer Science, Carleton University, Ottawa, Canada. An OWLED 2011 paper.
  • 2. Motivation. Machine learning approaches such as decision trees are commonly used in toxicity prediction. However, complex trees can be difficult to interpret, and there is no explanation for the category obtained. Moreover, many variant decision trees are being published, and they are difficult to compare. Can we use OWL ontologies to formally represent and compare decision trees? Figure: a simple toxicity decision tree. At each branching point a rule is evaluated, and based on its outcome either a final activity decision is made or judgment is deferred to another node.
  • 3. Druglikeness: Lipinski’s Rule of Five. A rule of thumb for druglikeness (oral activity in humans), made up of 4 rules with multiples of 5: a mass of 500 Daltons or less; 5 hydrogen bond donors or fewer; 10 hydrogen bond acceptors or fewer; a partition coefficient (logP) value between -5 and 5. All of these conditions must be satisfied for a molecule to be considered druglike; a molecule failing any of them is not druglike.
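The four conditions on this slide can be sketched as a single predicate. This is a minimal illustration, not the authors' code: the `Descriptors` type and its field names are hypothetical, the thresholds are the ones stated on the slide, and the descriptor values are assumed to have been computed elsewhere (the talk uses the Chemistry Development Kit for that step).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Hypothetical container for precomputed descriptor values."""
    molecular_weight: float  # Daltons
    h_bond_donors: int
    h_bond_acceptors: int
    log_p: float             # partition coefficient


def is_druglike(d: Descriptors) -> bool:
    """Return True only if all four conditions from the slide hold."""
    return (
        d.molecular_weight <= 500.0
        and d.h_bond_donors <= 5
        and d.h_bond_acceptors <= 10
        and -5.0 <= d.log_p <= 5.0
    )


# Made-up descriptor values, used only to exercise the predicate.
small = Descriptors(molecular_weight=180.2, h_bond_donors=1,
                    h_bond_acceptors=4, log_p=1.2)
heavy = Descriptors(molecular_weight=650.0, h_bond_donors=1,
                    h_bond_acceptors=4, log_p=1.2)
print(is_druglike(small), is_druglike(heavy))
```

Because the conditions are joined with `and`, failing any single one is enough to reject a molecule, which matches the wording of the slide.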
  • 4. Chemical Data. The Lipinski druglikeness dataset comprises 7000 compounds from the Human Metabolome Database (HMDB), with attributes computed using the Chemistry Development Kit. The tree was built with Weka, an open-source collection of machine learning algorithms for data mining that includes tools for data pre-processing, classification, regression, clustering, association rules, and visualization.
  • 5. Rule of Five Decision Tree. Correctly classified molecule counts are given in brackets. 100% accuracy in ten-fold cross-validation.
  • 6. Formalization. Diagram: Substance II and Substance III are each subClassOf Substance I. Substance I is something that has a molecular weight. Substance II is a kind of Substance I that has a molecular weight <= 500. Substance III is a kind of Substance I that has a molecular weight > 500.
  • 7. Formalization. Diagram: Substance II is subClassOf Substance I; each class ‘has attribute’ Molecular Weight, which for Substance II also ‘has value’; the opposite branch is labelled > 499.296759. Every node in the decision tree represents an entity having an attribute or feature, whose value may be specified. Substance I is something that has a molecular weight: ‘substance I’ equivalentClass ‘has attribute’ some ‘molecular weight’. Substance II is a kind of substance I with a specified molecular weight value: ‘substance II’ equivalentClass ‘substance I’ and ‘has attribute’ some (‘molecular weight’ and ‘has value’ double[<= 499.296759]).
  • 8. The Chemical Information Ontology (CHEMINF). 100+ chemical descriptors; 50+ chemical qualities. Relates descriptors to their specifications and to the software that generated them (along with the running parameters and the algorithms that they implement). Contributors: Nico Adams, Leonid Chepelev, Michel Dumontier, Janna Hastings, Egon Willighagen, Peter Murray-Rust, Christoph Steinbeck. http://semanticchemistry.googlecode.com
  • 9. A simple decision tree can be represented as a set of subsuming OWL classes. Methods: a Weka tree was trained and serialized into dot format; the Weka API was then used to read the document, and the OWL API to create the ontology.
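The dot-to-ontology step can be sketched in a few lines. The talk used the Weka API and the OWL API (both Java); the sketch below is only an illustration in plain Python with a regular-expression parser. The dot snippet, its node counts, and the `Substance_N…` class names are all assumptions made for the example, with the layout modelled on the general shape of a Weka J48 graph.

```python
import re

# Hypothetical dot text; real Weka output may differ in detail.
DOT = '''digraph J48Tree {
N0 [label="molecular weight" ]
N0->N1 [label="<= 500"]
N1 [label="druglike (3.0)" shape=box style=filled ]
N0->N2 [label="> 500"]
N2 [label="not druglike (2.0)" shape=box style=filled ]
}'''

NODE = re.compile(r'^(N\d+) \[label="([^"]*)"')
EDGE = re.compile(r'^(N\d+)->(N\d+) \[label="([^"]*)"')


def parse_dot(text):
    """Collect node labels and (parent, child, condition) edges."""
    labels, edges = {}, []
    for line in text.splitlines():
        line = line.strip()
        edge = EDGE.match(line)
        node = NODE.match(line)
        if edge:
            edges.append((edge.group(1), edge.group(2), edge.group(3)))
        elif node:
            labels[node.group(1)] = node.group(2)
    return labels, edges


def subclass_axioms(labels, edges):
    """One (child class, parent class, restriction) triple per tree edge."""
    return [
        (f"Substance_{child}", f"Substance_{parent}", f"{labels[parent]} {cond}")
        for parent, child, cond in edges
    ]


labels, edges = parse_dot(DOT)
for child, parent, restriction in subclass_axioms(labels, edges):
    print(f"{child} subClassOf {parent}  [{restriction}]")
```

Each tree edge becomes one subclass relationship, so the class hierarchy mirrors the tree: a child node is a kind of its parent, narrowed by the condition on the edge.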
  • 10. Each outcome may also be formalized in terms of the set of all attributes, obtained by drawing a path to the root: ‘druglike molecule’ equivalentClass ‘molecule’ and ‘has attribute’ some (‘molecular weight’ that ‘has value’ double[<= 500.0]) and ‘has attribute’ some (‘hydrogen bond donor count’ that ‘has value’ int[<= 5]) and ‘has attribute’ some (‘hydrogen bond acceptor count’ that ‘has value’ int[<= 10]) and ‘has attribute’ some (‘partition coefficient’ that ‘has value’ double[<= 5.0, >= -5.0])
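Drawing a path from an outcome back to the root amounts to collecting one restriction per edge and conjoining them. The sketch below assembles such a class expression as a string, in the style used on this slide. The `parent_of` map and the node names `n0` to `n4` are hypothetical; a real implementation would build OWL axioms through an ontology API rather than format text.

```python
def path_expression(leaf, parent_of, base="'molecule'"):
    """Walk from a leaf to the root and conjoin the restrictions met on the way."""
    conditions = []
    node = leaf
    while node in parent_of:
        parent, attribute, data_range = parent_of[node]
        conditions.append(
            f"'has attribute' some ('{attribute}' that 'has value' {data_range})"
        )
        node = parent
    conditions.reverse()  # root-first order, as on the slide
    return " and ".join([base] + conditions)


# child -> (parent, attribute tested at the parent, data range on the edge)
parent_of = {
    "n1": ("n0", "molecular weight", "double[<= 500.0]"),
    "n2": ("n1", "hydrogen bond donor count", "int[<= 5]"),
    "n3": ("n2", "hydrogen bond acceptor count", "int[<= 10]"),
    "n4": ("n3", "partition coefficient", "double[<= 5.0, >= -5.0]"),
}

print(path_expression("n4", parent_of))
```

Calling the function on an intermediate node such as `n2` yields the shorter expression for that node, which is how every node in the tree gets its own class definition.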
  • 11. Large-scale decision trees. The Lipinski example is fairly trivial. Can we create a new decision tree capable of classifying linked data? We obtained 1400 chemicals from an EPA ToxCast carcinogenic toxicity dataset, each labelled either toxic or non-toxic; computed 318 boolean features using the ToxTree API (http://toxtree.sourceforge.net/); generated the decision tree using Weka; generated the OWL ontology using the OWL API; generated individuals using the CHESS specification, with descriptors specified in the CHEMINF ontology; and performed classification using the OWL API with Pellet, and Protégé 4 with HermiT.
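What classification over boolean features looks like, and why keeping the path matters, can be sketched without any ontology tooling. The example below is a plain tree walk, not OWL reasoning: the feature names `alert_A` and `alert_B` and the tiny tree are invented for illustration and are not ToxTree features or the tree from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass
class Leaf:
    label: str


@dataclass
class Split:
    feature: str
    if_true: "Node"
    if_false: "Node"


Node = Union[Leaf, Split]


def classify(node: Node, features: Dict[str, bool]) -> Tuple[str, List[Tuple[str, bool]]]:
    """Return the predicted label and the list of (feature, value) tests taken."""
    path = []
    while isinstance(node, Split):
        value = features[node.feature]
        path.append((node.feature, value))
        node = node.if_true if value else node.if_false
    return node.label, path


# Hypothetical two-feature tree.
tree = Split("alert_A", Leaf("toxic"),
             Split("alert_B", Leaf("toxic"), Leaf("non-toxic")))

label, path = classify(tree, {"alert_A": False, "alert_B": True})
print(label, path)
```

The returned path plays the role of the explanation: it lists exactly which feature tests led to the label. In the OWL encoding the same information comes from the chain of classes an individual is inferred to belong to.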
  • 12. A decision tree to predict carcinogenic toxicity
  • 13. Decision Tree to OWL Ontology
  • 15. From data to automated reasoning. Diagram: data, then linked data, then automated reasoning (realization) over the OWL-encoded toxicity tree.
  • 18. Path through Decision Tree kindly provided by reasoning about the OWL ontology
  • 19. Comparison of toxicity trees. Along with the standard Lipinski Rule of Five ontology, we generated a variant where MW <= 250. Reasoning over the two ontologies, we see that the active compound class based on MW <= 250 is subsumed by the active compound class based on MW <= 500.
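The subsumption reported here follows from interval containment: every value that satisfies MW <= 250 also satisfies MW <= 500. The sketch below checks that containment directly for classes described as attribute intervals. It is a deliberately simplified stand-in for a description logic reasoner, and the dictionary encoding is an assumption made for this example.

```python
import math


def is_subsumed_by(sub, sup):
    """True if every constraint in `sup` is met at least as tightly by `sub`.

    Both arguments map an attribute name to a closed (low, high) interval.
    """
    for attribute, (sup_low, sup_high) in sup.items():
        if attribute not in sub:
            return False  # `sub` leaves this attribute unconstrained
        sub_low, sub_high = sub[attribute]
        if sub_low < sup_low or sub_high > sup_high:
            return False  # `sub` admits values that `sup` excludes
    return True


standard = {"molecular weight": (-math.inf, 500.0)}
variant = {"molecular weight": (-math.inf, 250.0)}

print(is_subsumed_by(variant, standard))  # variant class sits under standard
print(is_subsumed_by(standard, variant))  # the reverse does not hold
```

A reasoner reaches the same conclusion from the class definitions themselves, and it also handles cases this sketch ignores, such as open intervals, class hierarchies among attributes, and equivalent classes.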
  • 20. Conclusion. Decision trees can be faithfully represented as OWL ontologies. Because the ontologies are formalized, we can reason over them automatically and use them to classify new chemicals (and hence predict toxicity). If we maintain the structure of the decision tree, we can obtain explanations that list the attributes used in making the decision (unlike black-box counterparts). We expect that trees generated with different but aligned vocabularies may now be comparable.
  • 21. Acknowledgements. CHEMINF Group: Leo Chepelev, Janna Hastings, Egon Willighagen, Nico Adams. Toxicity Group: Leo Chepelev, Dana Klassen.
  • 22. dumontierlab.com; michel_dumontier@carleton.ca; Presentations: http://slideshare.com/micheldumontier