Enabling faster analysis of vaccine adverse event reports with ontology support
1. Enabling faster analysis of vaccine adverse event reports with ontology support
Mélanie Courtot, Ph.D. candidate, Brinkman lab
Knowledge Translation Seminar, March 21st 2013
2. Outline
• Problem statement and significance
• The Adverse Event Reporting Ontology (AERO) for adverse event report analysis
• Clinical standard
• Logical encoding
• Classified dataset
• Classification using MedDRA annotations and text mining
• AERO for data integration
• The Semantic Web
• VAERS as linked data
3. Problem statement
• Importance of monitoring adverse events
• Long-term effects, various demographics
• Detection of abnormal events in a population can lead to product withdrawals, etc.
• Current reporting systems for adverse events following immunization (AEFIs) use different standards (if any) to encode reports
• The resultant lack of consistency limits the ability to query
and assess potential safety issues
• Reports are manually assessed: time-consuming and costly
• Inability to assess all reports carefully
4. Goal and significance of my work
• Goal: Improve safety signal detection in vaccine AEFI reports
• Step 1: Augment existing standards with logically formalized
elements
• Step 2: Perform automatic case classification
• Step 3: Test classification utility to detect safety signals
• Significance: Increase the timeliness and cost-effectiveness of reliable adverse event signal detection
5. 4 steps to automated classification
1. We agree on a standard to describe adverse events
2. We encode that standard in a computer-amenable format
3. We map the clinical standard to current adverse event annotations
4. We classify reports of adverse events according to established guidelines
Clinical standard → Logical encoding → Classified dataset → Classified reports
6. Existing standard: the Brighton Collaboration
• https://brightoncollaboration.org
• Provides case definitions and guidelines to standardize
reporting
• Well-established network (adopted as standard in Canada in 2009)
• Benefits of working with Brighton:
• Existing software tool
• Extensive network of collaborators, shared vision
Clinical standard
8. Strategy for encoding adverse event reports
• Model the domain using an ontology
• Ontologies typically have two distinct components:
• Names for important concepts in the domain
• Prokaryotic cells
• Eukaryotic cells
• Background knowledge/constraints on the domain
• Nothing can be both a prokaryotic and a eukaryotic cell
Logical encoding
9. Strategy for encoding adverse event reports
• Ontology encoded using the Web Ontology
Language (OWL 2)
• The Open Biological and Biomedical Ontologies (OBO) Foundry helps with quality, interoperability, and avoiding redundant work
• More than 100 biomedical ontologies in the suite, e.g., the Gene Ontology (GO)
• Reuse of resources (ontologies and tools)
Logical encoding
10. Reasoning is critical
• Prokaryotic and Eukaryotic cell are declared disjoint
• Fungal cell is a Eukaryotic cell
• Spore is both a Fungal cell and a Prokaryotic cell => inconsistency
doi:10.1371/journal.pone.0022006.g003
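The inconsistency above can be sketched in a few lines of plain Python. This is only a toy illustration of what an OWL reasoner (e.g., HermiT or Pellet) detects automatically; the class graph is hard-coded here and real reasoners handle far richer axioms:

```python
# Toy illustration of disjointness reasoning. NOT an OWL reasoner:
# the subclass graph and disjointness axiom are hard-coded for this example.

SUBCLASS_OF = {
    "Fungal cell": {"Eukaryotic cell"},
    "Spore": {"Fungal cell", "Prokaryotic cell"},
}
DISJOINT = [("Prokaryotic cell", "Eukaryotic cell")]

def ancestors(cls):
    """All superclasses of cls, including cls itself (transitive closure)."""
    seen = {cls}
    frontier = [cls]
    while frontier:
        for parent in SUBCLASS_OF.get(frontier.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen

def inconsistent(cls):
    """cls is unsatisfiable if it inherits from two disjoint classes."""
    ancs = ancestors(cls)
    return any(a in ancs and b in ancs for a, b in DISJOINT)

print(inconsistent("Spore"))        # True: Spore is both prokaryotic and eukaryotic
print(inconsistent("Fungal cell"))  # False
```

The payoff of encoding the guideline in OWL is exactly this kind of automatic error detection, but driven by the ontology's axioms rather than hand-written checks.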
11. Clinical guideline in AERO
• Goal: provide a pattern to encode adverse event following
immunization guidelines
• This pattern should be applicable to any type of clinical
guideline
• Enable the reports to be annotated with diagnosis
according to a specific guideline (and keep track of
what it is)
• We want to:
• Encode the guideline in OWL
• Be able to infer correct classification (i.e., perform accurate
diagnosis)
Logical encoding
13. Current status
• Pattern implemented in the OWL file for anaphylaxis
• Has been successfully used to model the WHO malaria clinical guidelines (Jie Zheng, UPenn)
• Paper submitted (yay!)
• Need to add other guidelines
Logical encoding
14. VAERS dataset
• VAERS = Vaccine Adverse Event Reporting System
• Run jointly by the Centers for Disease Control and Prevention (CDC) and the Food and Drug Administration (FDA) in the United States
• Spontaneous reporting system
• Issues with underreporting, quality of reporting
• Uses MedDRA annotations (Medical Dictionary for Regulatory Activities)
Classified dataset
16. Classified VAERS data
• Unclassified files available publicly
• Classified dataset available upon request (in this case
H1N1 dataset)
• Cleanup
• No default NULL value: “none”, “null”, “”…
• Multiple languages: encoding issue with Spanish
• 5 MedDRA terms per report, or duplicates
• Pre-processing required
• Load into database
• Match to public records
Classified dataset
17. Classification using MedDRA annotations
• Goal is to map the current Brighton terms in AERO to their MedDRA counterparts
• Then try to classify the MedDRA-annotated reports using the Brighton criteria
• Compare that with classification done by medical experts
Classified reports
18. Mapping to MedDRA
• Translate, as well as possible, MedDRA annotations to Brighton symptoms
• Import selected MedDRA terms into OWL, following the general strategy of Minimal Information to Reference an External Ontology Term (MIREOT) (Courtot, et al. 2011)
• Standardized MedDRA Queries provide useful
documentation on how to interpret MedDRA
• OWL used to define Brighton symptoms in terms of
MedDRA terms (this will be only approximate)
Classified reports
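One way to picture the MedDRA-based classification is a small rule-based sketch. Both the term mapping and the "two organ systems" rule below are invented placeholders, not the actual Brighton anaphylaxis case definition:

```python
# Sketch of classifying a MedDRA-annotated report against a Brighton-style
# case definition. The mapping and the criterion are ILLUSTRATIVE placeholders,
# not the real Brighton anaphylaxis definition.

# Hypothetical mapping from MedDRA preferred terms to Brighton symptom groups.
MEDDRA_TO_BRIGHTON = {
    "Rash generalised": "dermatological_major",
    "Urticaria": "dermatological_major",
    "Stridor": "respiratory_major",
    "Hypotension": "cardiovascular_major",
}

def brighton_level(meddra_terms):
    """Toy rule: 'level 1' requires major criteria from >= 2 organ systems."""
    groups = {MEDDRA_TO_BRIGHTON[t] for t in meddra_terms if t in MEDDRA_TO_BRIGHTON}
    systems = {g.split("_")[0] for g in groups if g.endswith("_major")}
    return "level 1" if len(systems) >= 2 else "unclassified"

report = ["Rash generalised", "Stridor", "Pyrexia"]  # Pyrexia has no mapping
print(brighton_level(report))  # "level 1": two organ systems with major criteria
```

In the actual pipeline these rules are not hand-written Python: they come from the OWL encoding of the guideline, so the reasoner performs the classification.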
19. Classification using text
• In collaboration with Seeker Solutions, a Victoria-based company
• Goal is to use the free-text part of the reports to classify them
• Process:
• Training data: a set of reports that have been manually classified
• A machine learning algorithm learns patterns leading to correct classification
• The model is applied to new testing data
• Two types of classification tested:
• Likelihood
• Topic modeling
Classified reports
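The likelihood approach can be illustrated with a toy word-likelihood (naive-Bayes-style) classifier. The training snippets are invented; a real system would train on the manually classified VAERS reports:

```python
# Toy word-likelihood text classifier, in the spirit of the 'likelihood'
# approach above. Training examples are invented for illustration.
from collections import Counter
import math

TRAIN = [
    ("hives and wheezing after injection", "anaphylaxis"),
    ("swelling rash difficulty breathing", "anaphylaxis"),
    ("mild fever and soreness at site", "other"),
    ("headache and fatigue next day", "other"),
]

counts = {}  # label -> Counter of word frequencies
for text, label in TRAIN:
    counts.setdefault(label, Counter()).update(text.split())

def classify(text):
    """Pick the label maximizing the Laplace-smoothed log-likelihood."""
    vocab = len({w for c in counts.values() for w in c})
    def loglik(label):
        c = counts[label]
        total = sum(c.values())
        return sum(math.log((c[w] + 1) / (total + vocab)) for w in text.split())
    return max(counts, key=loglik)

print(classify("rash and wheezing"))  # "anaphylaxis"
```

Topic modeling (the second approach) works differently: it discovers latent themes in the corpus rather than fitting per-label word likelihoods.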
22. Current status
• Testing classification with the MedDRA terms
• Need to work on the MedDRA mapping
• Test classification with AERO (and compare with the MedDRA-based one)
• Refine text classification
• Using the ontology to guide clustering
• Using the Canadian dataset
24. The semantic web
• From a web of documents to a web of data
• HTML pages can’t be understood by machines; humans
have to manually follow hyperlinks
• The Semantic Web uses standards for data representation, querying, and vocabularies to link data behind the scenes
• Use of Uniform Resource Identifiers (URIs) and the Resource Description Framework (RDF)
25. RDF and URIs
• RDF: a language used to represent information about
resources on the web
• RDF statement: subject, predicate, object
• URIs: unique identifiers for things
• http://purl.obolibrary.org/obo/AERO_0000244: major dermatological
criterion for anaphylaxis according to Brighton
Example triple: generalized rash → is_a → major dermatological criterion for anaphylaxis
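A triple can be sketched with plain tuples (a real implementation would use an RDF library such as rdflib). The AERO URI is the one from this slide; the "generalized_rash" label stands in for the term's real URI:

```python
# An RDF statement is a (subject, predicate, object) triple. Minimal sketch
# with plain tuples; a real system would use an RDF library such as rdflib.
# The AERO URI below is from the slide; "generalized_rash" is a readable
# stand-in for the term's actual URI.
AERO_0000244 = "http://purl.obolibrary.org/obo/AERO_0000244"  # major dermatological criterion

triples = {
    ("generalized_rash", "is_a", AERO_0000244),
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is asserted."""
    return {o for s, p, o in triples if s == subject and p == predicate}

print(objects("generalized_rash", "is_a"))
```

Because every node is a URI, any other dataset that uses the same URI is automatically talking about the same thing, which is what makes linking data tractable.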
26. Linked Open Data cloud
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
28. VAERS as linked data
• Transform the VAERS dataset into RDF to enable better integration with existing resources
• No need to worry about resources’ structure (CSV,
databases, XML)
• Each report is an instance of a VAERS report
• The system will also provide technical infrastructure to test classification
• RDF automatically generated from the database
containing VAERS data
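The database-to-RDF step might look like the following sketch, assuming hypothetical column names and a made-up example.org namespace (the real VAERS schema and URIs differ):

```python
# Sketch of generating RDF (N-Triples) from database rows. Column names
# ('vaers_id', 'vaccine', 'meddra') and the example.org namespace are
# HYPOTHETICAL; the actual VAERS schema differs.
EX = "http://example.org/vaers/"

def row_to_ntriples(row):
    """Serialize one report row as N-Triples lines."""
    s = f"<{EX}report/{row['vaers_id']}>"
    lines = [
        f'{s} <{EX}vocab#vaccine> "{row["vaccine"]}" .',
        f'{s} <{EX}vocab#meddra_term> "{row["meddra"]}" .',
    ]
    return "\n".join(lines)

print(row_to_ntriples({"vaers_id": 123, "vaccine": "H1N1", "meddra": "Urticaria"}))
```

Each report becomes a URI-identified resource, so its fields can then be linked to external vocabularies instead of staying locked in table columns.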
31. Querying across linked data
• URIs (or mappings between URIs) to link different resources
• Querying the VAERS dataset
• E.g., are there differences in the types of adverse events between a live attenuated flu vaccine and a trivalent inactivated one?
• Querying across multiple datasets
• Identify drugs in text (e.g. Benadryl) and infer they are anti-allergic
agents via DrugBank
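The flu-vaccine comparison above can be mimicked over toy triples (invented data; a real system would run a SPARQL query against the RDF store):

```python
# Toy version of the query "do adverse event types differ between live
# attenuated (LAIV) and trivalent inactivated (TIV) flu vaccines?".
# The triples are INVENTED; a real system would issue SPARQL over RDF.
from collections import defaultdict

triples = [
    ("report1", "vaccine", "LAIV"),
    ("report1", "event", "Pyrexia"),
    ("report2", "vaccine", "TIV"),
    ("report2", "event", "Injection site pain"),
    ("report3", "vaccine", "LAIV"),
    ("report3", "event", "Rhinorrhoea"),
]

def events_by_vaccine():
    """Group reported adverse events by the vaccine given (a join over triples)."""
    vacc = {s: o for s, p, o in triples if p == "vaccine"}
    out = defaultdict(set)
    for s, p, o in triples:
        if p == "event":
            out[vacc[s]].add(o)
    return dict(out)

print(events_by_vaccine())
```

The join on the report subject is exactly what a SPARQL engine does with shared variables; the advantage of linked data is that the joined resources can live in different datasets (e.g., DrugBank for drug classes).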
32. Example: link the state code in VAERS to state info in DBpedia, and pass the result to the Google Visualization API
33. Acknowledgements
• Alan Ruttenberg, Ryan Brinkman
• Oliver He, Yu Lin, Lindsay Cowell, Barry Smith, Ryan
Brinkman, Peter d’Eustachio, Albert Goldfain
• Julie Lafleche, Lauren McDonald, Robert Pless,
Barbara Law, Jan Bonhoeffer, Jean-Paul Collet
• Brinkman lab