HyQue: Evaluating scientific Hypotheses using semantic web technologies
1. HyQue: Evaluating scientific Hypotheses using semantic web technologies. Michel Dumontier, PhD. Associate Professor of Bioinformatics, Department of Biology, Institute of Biochemistry and School of Computer Science @ Carleton University. Adjunct Professor (Professeur associé), Department of Computer Science and Software Engineering, Université Laval
2. HyQue is a collaborative work. Work performed by Alison Callahan, a PhD student under my supervision @ Carleton University. Partnership with Dr. Nigam Shah, Assistant Professor at Stanford University.
5. With unparalleled growth in research outputs, uncovering all the evidence to support or refute a hypothesis is becoming increasingly difficult. Citations added to Medline 1995-2009. Source: http://www.nlm.nih.gov/bsd/stats/cit_added.html
12. Incremental hypothesis improvement [1]
[1] Racunas S. A., Shah N. H., Albert I. and Fedoroff N. V. (2004). HyBrow: A prototype system for computer-aided hypothesis evaluation. Bioinformatics 20(Suppl. 1): i1-i8.
23. Hypothesis
h1: e1 (Gal4p induces expression of GAL1)
    simple event-based expression
h2: e2 (Gal3p induces expression of GAL2) AND e3 (Gal4p induces expression of GAL7)
    conjunctive hypothesis: must satisfy two expressions
h3: e4 (Gal4p induces expression of GAL7) AND e5 (Gal80p inhibits production of Gal4p when GAL3 is over-expressed) AND e6 (Gal80p induces expression of GAL7)
    conjunctive hypothesis with conditional expression
28. Bio2RDF is part of a growing web of linked data. “Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
29. The Semantic Web is a web of knowledge. It is about standards for publishing, sharing and querying knowledge drawn from diverse sources. It enables the answering of sophisticated questions.
30. ontology as a strategy to formally represent knowledge
31. The Web Ontology Language (OWL) has explicit semantics. It can therefore be used to capture knowledge in a machine-understandable way.
41. HyQue propositions only specify events
HyQue hypothesis ≡ ‘proposition’ that ‘specifies’ only ‘event’
HyQue hypothesis ≡ ‘proposition’ that ‘has component part’ only (‘proposition’ that ‘specifies’ only ‘event’)
43. HyQue events
Events are composed of conditional assertions on a relation between ‘actor’ and ‘target’: induces(agent, target, context, location)
For decidable logic (OWL), an n-ary object is used:
Event ‘has agent’ agent, ‘has target’ target, ‘has context’ context, ‘is located in’ location
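The n-ary pattern above can be sketched in a few lines of Python: the four-argument relation becomes an event node with one binary statement per argument. This is only an illustration, not HyQue's code. The property labels come from the slide, but the `Event` class, the `to_triples` helper, and the identifiers `:e1`, `Gal4p`, `GAL1` and `nucleus` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Event:
    """Hypothetical n-ary event node (names are illustrative, not HyQue's API)."""
    node: str                       # identifier of the event node itself
    relation: str                   # event type, e.g. "induces"
    agent: str
    target: str
    context: Optional[str] = None   # optional condition on the event
    location: Optional[str] = None  # optional cellular location


def to_triples(event: Event) -> List[Triple]:
    """Flatten the n-ary relation into binary (subject, predicate, object) statements."""
    triples: List[Triple] = [
        (event.node, "rdf:type", event.relation),
        (event.node, "has agent", event.agent),
        (event.node, "has target", event.target),
    ]
    if event.context is not None:
        triples.append((event.node, "has context", event.context))
    if event.location is not None:
        triples.append((event.node, "is located in", event.location))
    return triples


# induces(Gal4p, GAL1, <no context>, nucleus) expressed through an event node
e1 = Event(node=":e1", relation="induces", agent="Gal4p", target="GAL1",
           location="nucleus")
for triple in to_triples(e1):
    print(triple)
```

Because every argument hangs off the event node, optional arguments such as the context can simply be left out without changing the shape of the other statements.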
44. All data are represented using RDF
RDF’s basic representation unit is the “triple”: <subject> <predicate> <object>
:h rdf:type hyque:Hypothesis .
:h hyque:has-component-part :p1 .
:p1 rdf:type hyque:Proposition .
Hypothesis has component part proposition; proposition specifies event: Gal4p positively regulates the expression of GAL1
45. All data are represented using RDF
:h a hyque:Hypothesis ;
   hyque:specifies :e1 .
:e1 a <http://bio2rdf.org/go:0010628> ;   # positive regulation of gene expression
   hyque:is_negated "0" ;
   hyque:agent <http://bio2rdf.org/sgd:Gal4p> ;
   hyque:target <http://bio2rdf.org/sgd:GAL1> ;
   ….
Hypothesis specifies event: Gal4p positively regulates the expression of GAL1
49. Query results evaluated based on rule sets
‘induce’ rule (maximum score: 5):
Is event negated? If yes, subtract 2
Is logical operator ‘induce’? If yes, add 1; if no, subtract 1
Is agent of type ‘protein’ or ‘RNA’? If yes, add 1; if of type ‘gene’, subtract 1
Is target of type ‘gene’? If yes, add 1; if no, subtract 1
Does agent have known ‘transcription factor activity’? If yes, add 1
Is event located in the ‘nucleus’? If yes, add 1; if no, subtract 1
Ontology terms referenced by the rule: GO:0010628, CHEBI:36080, SO:0000236, GO:0003700, GO:0005634
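The ‘induce’ rule set above can be written out as a small scoring function. This is a sketch under assumptions, not HyQue's implementation: the `EventEvidence` record and its field names are hypothetical stand-ins for whatever the query results actually provide, and dividing the points by the maximum of 5 follows the normalisation used on the slides.

```python
from dataclasses import dataclass

MAX_INDUCE_POINTS = 5  # maximum score stated for the 'induce' rule


@dataclass
class EventEvidence:
    """Hypothetical summary of query results for one event (illustrative fields)."""
    negated: bool
    operator: str            # e.g. "induce"
    agent_type: str          # e.g. "protein", "RNA", "gene"
    target_type: str         # e.g. "gene"
    agent_has_tf_activity: bool
    located_in_nucleus: bool


def induce_points(ev: EventEvidence) -> int:
    """Apply each check of the 'induce' rule in turn and total the points."""
    points = 0
    if ev.negated:
        points -= 2
    points += 1 if ev.operator == "induce" else -1
    if ev.agent_type in ("protein", "RNA"):
        points += 1
    elif ev.agent_type == "gene":
        points -= 1
    points += 1 if ev.target_type == "gene" else -1
    if ev.agent_has_tf_activity:
        points += 1  # the rule gives no penalty when this is unknown
    points += 1 if ev.located_in_nucleus else -1
    return points


def induce_score(ev: EventEvidence) -> float:
    """Normalise the points by the maximum achievable score."""
    return induce_points(ev) / MAX_INDUCE_POINTS


# Made-up evidence: every check passes except known transcription factor activity.
example = EventEvidence(negated=False, operator="induce", agent_type="protein",
                        target_type="gene", agent_has_tf_activity=False,
                        located_in_nucleus=True)
print(induce_points(example), induce_score(example))
```

The example evidence is invented to show how 4 of 5 points can arise; it does not claim to be the evidence behind any particular event in the talk.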
55. Event negated in published literature: no -> 0. Thus, the e1 event obtains 4 out of a maximum of 5 points, and receives a score of 0.8.
56. Evaluating hypotheses
Events e2, e3, and e4 are also ‘induce’ events and are evaluated using the ‘induce’ rule set, each obtaining a score of 0.8.
e5 is undecidable: there are no data in HKB to support that Gal80p inhibits Gal4p when GAL3 is over-expressed, so the entire third event set is deemed undecidable.
The overall hypothesis score is selected from e1 (0.8) and e2 + e3 (0.8 + 0.8 = 1.6).
The final hypothesis score is 1.6; events e2 + e3 have the strongest experimental support.
e1 (Gal4p induces expression of GAL1)
OR e2 (Gal3p induces expression of GAL2) AND e3 (Gal4p induces expression of GAL7)
OR e4 (Gal4p induces expression of GAL7) AND e5 (Gal80p inhibits production of Gal4p when GAL3 is over-expressed) AND e6 (Gal80p induces expression of GAL7)
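The selection step can be sketched as follows, reading the worked example literally: each AND-connected event set is scored by summing its event scores, a set containing any undecidable event is itself undecidable, and the hypothesis takes the highest score among the decidable sets. The function names and the use of `None` for "undecidable" are assumptions made for this illustration, not HyQue's actual code.

```python
from typing import Dict, List, Optional


def event_set_score(event_ids: List[str],
                    scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Sum the scores of AND-connected events; None means the set is undecidable."""
    total = 0.0
    for event_id in event_ids:
        score = scores.get(event_id)  # a missing or None score counts as undecidable
        if score is None:
            return None
        total += score
    return total


def hypothesis_score(event_sets: List[List[str]],
                     scores: Dict[str, Optional[float]]) -> Optional[float]:
    """Pick the best score among the OR-connected event sets that are decidable."""
    decidable = [s for s in (event_set_score(es, scores) for es in event_sets)
                 if s is not None]
    return max(decidable) if decidable else None


# Event scores from the slide; e5 is undecidable. The slide gives no score for e6,
# so it is left out (the third set is already undecidable because of e5).
scores = {"e1": 0.8, "e2": 0.8, "e3": 0.8, "e4": 0.8, "e5": None}
event_sets = [["e1"], ["e2", "e3"], ["e4", "e5", "e6"]]
print(hypothesis_score(event_sets, scores))
```

Note that under this reading scores are summed rather than averaged, so a set with more supported events can outrank a single fully supported event.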
59. The Semantic Automated Discovery and Integration (SADI) framework makes it easy to create Semantic Web services using OWL classes as service inputs and outputs. http://sadiframework.org
Users can post a hypothesis in RDF and receive the hypothesis evaluation RDF.
Mark Wilkinson, UBC; Michel Dumontier, Carleton University; Christopher Baker, UNB
HyQue can become part of a workflow for investigations.
61. Expand beyond the GAL network with network reconstructions and NLP-facilitated data curation
63. Acknowledgements
Alison Callahan (developing HyQue)
Nigam Shah (key collaborator)
Stephen Racunas and Amar Das for helpful discussions
Bio2RDF: Peter Ansell, Francois Belleau, Alison Callahan, Jacques Corbeil, Jose Cruz-Toledo, Alex De Leon, Steve Etlinger, James Hogan, Nichealla Keith, Jean Morissette, Marc-Alexandre Nolin, Nicole Tourigny, Philippe Rigault, and Paul Roe
SADI: Christopher Baker, Melanie Courtot, Jose Cruz-Toledo, Steve Etlinger, Nichealla Keith, Artjom Klein, Luke McCarthy, Silvane Paixao, Ben Vandervalk, Natalia Villanueva-Rosales, Mark Wilkinson
64. dumontierlab.com michel_dumontier@carleton.ca
Editor's notes
HyQue is a hypothesis-based query and evaluation tool
So we see the scientist asking a question …. [NEXT SLIDE]
… which they will test by developing hypotheses, carrying out experiments to test these hypotheses, and then using the results to refine their research. What is missing from this representation is the interaction of the individual scientist with their community of scientists, sharing data and using the results of others’ experiments to inform their own work. What is a hypothesis? “a supposition or proposed explanation made on the basis of limited evidence as a starting point for further investigation” (OED). (Ideally) scientists revisit hypotheses as their research progresses and use new knowledge to incrementally improve hypotheses over time.
Scientists are faced with an exponentially increasing amount of biological data on the Web and in scientific papers. It is possible that information and knowledge supporting or refuting a given hypothesis already exists; finding it becomes the challenge. All of this information is available in different formats as well. We, as individuals, cannot accurately evaluate hypotheses in this context and at a scale consistent with the coverage of relevant information resources. We need to develop methods leveraging computational power and reasoning to evaluate hypotheses based on large amounts of existing background information. Can we accurately evaluate hypotheses in this context and at a scale consistent with the coverage of relevant information resources?
The Hypothesis Browser (HyBrow) is one research project motivated by this problem. HyBrow emphasizes consistency checking of hypotheses both internally (do all of the pieces of the hypothesis logically fit?) and with respect to external information (is this hypothesis consistent with what we know?). Constraints: forbidden entity types, events or locations in a domain, for example. Rules: judgments of support or conflict given a set of facts. HyBrow formulates hypotheses as composed of events, which involve entities interacting under a set of specified conditions.
The GAL gene network in yeast: Gal3p, Gal4p and Gal80p have transcriptional control over the transport gene, the enzymes and their own genes.
We represent a hypothesis as a collection of propositions
Binary relations are insufficient
This is part of a hypothesis represented in N3 and used as input to HyQue. Note: Binding between galactose and Gal3p does not return any results; there IS binding between Gal3p and Gal80p.
A SPARQL CONSTRUCT statement is used to generate an RDF representation of the relevant results, which is passed to HyQue, parsed using ARC2 and evaluated using event-specific rules.
The RDF representing the evaluation of the input hypothesis is linked to both the hypothesis AND the data used to evaluate the hypothesis
This is a screenshot of some HyQue data in Virtuoso, a triple store system that we use to store and access RDF