Enabling faster analysis of vaccine adverse event reports with ontology support
1. Enabling faster analysis of vaccine adverse event reports with ontology support
Mélanie Courtot, Ph.D. candidate, Brinkman lab
Knowledge Translation Seminar, March 21st 2013
2. Outline
• Problem statement and significance
• The Adverse Event Reporting Ontology (AERO) for adverse event report analysis
• Clinical standard
• Logical encoding
• Classified dataset
• Classification using MedDRA annotations and text mining
• AERO for data integration
• The Semantic Web
• VAERS as linked data
3. Problem statement
• Importance of monitoring adverse events
• Long-term effects, various demographics
• Detection of abnormal events in a population can lead to product withdrawals, etc.
• Current reporting systems for adverse events following immunization (AEFIs) use different standards (if any) to encode reports
• The resultant lack of consistency limits the ability to query
and assess potential safety issues
• Reports are manually assessed: time-consuming and costly
• Inability to assess all reports carefully
4. Goal and significance of my work
• Goal: Improve safety signal detection in vaccine AEFI reports
• Step 1: Augment existing standards with logically formalized
elements
• Step 2: Perform automatic case classification
• Step 3: Test classification utility to detect safety signals
• Significance: Increase the timeliness and cost-effectiveness of reliable adverse event signal detection
5. 4 steps to automated classification
1. We agree on a standard to describe adverse events
2. We encode that standard in a computer-amenable format
3. We map the clinical standard to current adverse event annotations
4. We classify reports of adverse events according to established guidelines
Clinical standard → Logical encoding → Classified dataset → Classified reports
6. Existing standard: the Brighton Collaboration
• https://brightoncollaboration.org
• Provides case definitions and guidelines to standardize
reporting
• Well-established network (adopted as standard in Canada in 2009)
• Benefits of working with Brighton:
• Existing software tool
• Extensive network of collaborators, shared vision
Clinical standard
8. Strategy for encoding adverse event reports
• Model the domain using an ontology
• Ontologies typically have two distinct components:
• Names for important concepts in the domain
• Prokaryotic cells
• Eukaryotic cells
• Background knowledge/constraints on the domain
• Nothing can be both a prokaryotic and a eukaryotic cell
Logical encoding
9. Strategy for encoding adverse event reports
• Ontology encoded using the Web Ontology
Language (OWL 2)
• The Open Biological and Biomedical Ontologies (OBO) Foundry helps with quality, interoperability, and avoiding redundant work
• More than 100 biomedical ontologies in the suite, e.g., the Gene Ontology (GO)
• Reuse of resources (ontologies and tools)
Logical encoding
10. Reasoning is critical
• Prokaryotic and Eukaryotic cell are declared disjoint
• Fungal cell is a Eukaryotic cell
• Spore is both a Fungal cell and a Prokaryotic cell => inconsistency
doi:10.1371/journal.pone.0022006.g003
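The inconsistency above can be sketched in a few lines of plain Python. This is only a toy illustration of what an OWL reasoner (e.g., HermiT or Pellet) detects automatically; the class graph is hard-coded here and real reasoners handle far richer axioms:

```python
# Toy illustration of disjointness reasoning. NOT an OWL reasoner:
# the subclass graph and disjointness axiom are hard-coded for this example.

SUBCLASS_OF = {
    "Fungal cell": {"Eukaryotic cell"},
    "Spore": {"Fungal cell", "Prokaryotic cell"},
}
DISJOINT = [("Prokaryotic cell", "Eukaryotic cell")]

def ancestors(cls):
    """All superclasses of cls, including cls itself (transitive closure)."""
    seen = {cls}
    frontier = [cls]
    while frontier:
        for parent in SUBCLASS_OF.get(frontier.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return seen

def inconsistent(cls):
    """cls is unsatisfiable if it inherits from two disjoint classes."""
    ancs = ancestors(cls)
    return any(a in ancs and b in ancs for a, b in DISJOINT)

print(inconsistent("Spore"))        # True: Spore is both prokaryotic and eukaryotic
print(inconsistent("Fungal cell"))  # False
```

The payoff of encoding the guideline in OWL is exactly this kind of automatic error detection, but driven by the ontology's axioms rather than hand-written checks.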
11. Clinical guideline in AERO
• Goal: provide a pattern to encode adverse event following
immunization guidelines
• This pattern should be applicable to any type of clinical
guideline
• Enable the reports to be annotated with diagnosis
according to a specific guideline (and keep track of
what it is)
• We want to:
• Encode the guideline in OWL
• Be able to infer correct classification (i.e., perform accurate
diagnosis)
Logical encoding
13. Current status
• Pattern implemented in the OWL file for anaphylaxis
• Has been successfully used to model the WHO malaria clinical guidelines (Jie Zheng, UPenn)
• Paper submitted (yay!)
• Need to add other guidelines
Logical encoding
14. VAERS dataset
• VAERS = Vaccine Adverse Event Reporting System
• Run jointly by the Centers for Disease Control and Prevention (CDC) and the Food and Drug Administration (FDA) in the United States
• Spontaneous reporting system
• Issues with underreporting, quality of reporting
• Uses MedDRA annotations (Medical Dictionary for Regulatory Activities)
Classified dataset
16. Classified VAERS data
• Unclassified files available publicly
• Classified dataset available upon request (in this case
H1N1 dataset)
• Cleanup
• No default NULL value: “none”, “null”, “”…
• Multiple languages: encoding issue with Spanish
• 5 MedDRA terms per report, or duplicates
• Pre-processing required
• Load into database
• Match to public records
Classified dataset
17. Classification using MedDRA annotations
• Goal is to map the current Brighton terms in AERO to their MedDRA counterparts
• Then try to classify the MedDRA-annotated reports using the Brighton criteria
• Compare that with classification done by medical experts
Classified reports
18. Mapping to MedDRA
• Translate, as well as possible, MedDRA annotations to Brighton symptoms
• Import selected MedDRA terms into OWL, following the general strategy of Minimal Information to Reference an External Ontology Term (MIREOT) (Courtot, et al. 2011)
• Standardized MedDRA Queries provide useful
documentation on how to interpret MedDRA
• OWL used to define Brighton symptoms in terms of
MedDRA terms (this will be only approximate)
Classified reports
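One way to picture the MedDRA-based classification is a small rule-based sketch. Both the term mapping and the "two organ systems" rule below are invented placeholders, not the actual Brighton anaphylaxis case definition:

```python
# Sketch of classifying a MedDRA-annotated report against a Brighton-style
# case definition. The mapping and the criterion are ILLUSTRATIVE placeholders,
# not the real Brighton anaphylaxis definition.

# Hypothetical mapping from MedDRA preferred terms to Brighton symptom groups.
MEDDRA_TO_BRIGHTON = {
    "Rash generalised": "dermatological_major",
    "Urticaria": "dermatological_major",
    "Stridor": "respiratory_major",
    "Hypotension": "cardiovascular_major",
}

def brighton_level(meddra_terms):
    """Toy rule: 'level 1' requires major criteria from >= 2 organ systems."""
    groups = {MEDDRA_TO_BRIGHTON[t] for t in meddra_terms if t in MEDDRA_TO_BRIGHTON}
    systems = {g.split("_")[0] for g in groups if g.endswith("_major")}
    return "level 1" if len(systems) >= 2 else "unclassified"

report = ["Rash generalised", "Stridor", "Pyrexia"]  # Pyrexia has no mapping
print(brighton_level(report))  # "level 1": two organ systems with major criteria
```

In the actual pipeline these rules are not hand-written Python: they come from the OWL encoding of the guideline, so the reasoner performs the classification.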
19. Classification using text
• In collaboration with Seeker Solutions, a Victoria-based company
• Goal is to use the free-text part of the reports to classify them
• Process:
• Training data: a set of reports that have been manually classified
• A machine learning algorithm learns patterns leading to correct classification
• The model is applied to new testing data
• Two types of classification tested:
• Likelihood
• Topic modeling
Classified reports
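The likelihood approach can be illustrated with a toy word-likelihood (naive-Bayes-style) classifier. The training snippets are invented; a real system would train on the manually classified VAERS reports:

```python
# Toy word-likelihood text classifier, in the spirit of the 'likelihood'
# approach above. Training examples are invented for illustration.
from collections import Counter
import math

TRAIN = [
    ("hives and wheezing after injection", "anaphylaxis"),
    ("swelling rash difficulty breathing", "anaphylaxis"),
    ("mild fever and soreness at site", "other"),
    ("headache and fatigue next day", "other"),
]

counts = {}  # label -> Counter of word frequencies
for text, label in TRAIN:
    counts.setdefault(label, Counter()).update(text.split())

def classify(text):
    """Pick the label maximizing the Laplace-smoothed log-likelihood."""
    vocab = len({w for c in counts.values() for w in c})
    def loglik(label):
        c = counts[label]
        total = sum(c.values())
        return sum(math.log((c[w] + 1) / (total + vocab)) for w in text.split())
    return max(counts, key=loglik)

print(classify("rash and wheezing"))  # "anaphylaxis"
```

Topic modeling (the second approach) works differently: it discovers latent themes in the corpus rather than fitting per-label word likelihoods.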
22. Current status
• Testing classification with the MedDRA terms
• Need to work on the MedDRA mapping
• Test classification with AERO (and compare with the MedDRA-based one)
• Refine text classification
• Using the ontology to guide clustering
• Using the Canadian dataset
24. The semantic web
• From a web of documents to a web of data
• HTML pages can’t be understood by machines; humans
have to manually follow hyperlinks
• The Semantic Web uses standards for data representation, querying, and vocabularies to link data behind the scenes
• Use of Uniform Resource Identifiers (URIs) and the Resource Description Framework (RDF)
25. RDF and URIs
• RDF: a language used to represent information about
resources on the web
• RDF statement: subject, predicate, object
• URIs: unique identifiers for things
• http://purl.obolibrary.org/obo/AERO_0000244: major dermatological
criterion for anaphylaxis according to Brighton
Example triple: generalized rash → is_a → major dermatological criterion for anaphylaxis
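A triple can be sketched with plain tuples (a real implementation would use an RDF library such as rdflib). The AERO URI is the one from this slide; the "generalized_rash" label stands in for the term's real URI:

```python
# An RDF statement is a (subject, predicate, object) triple. Minimal sketch
# with plain tuples; a real system would use an RDF library such as rdflib.
# The AERO URI below is from the slide; "generalized_rash" is a readable
# stand-in for the term's actual URI.
AERO_0000244 = "http://purl.obolibrary.org/obo/AERO_0000244"  # major dermatological criterion

triples = {
    ("generalized_rash", "is_a", AERO_0000244),
}

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is asserted."""
    return {o for s, p, o in triples if s == subject and p == predicate}

print(objects("generalized_rash", "is_a"))
```

Because every node is a URI, any other dataset that uses the same URI is automatically talking about the same thing, which is what makes linking data tractable.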
26. Linked Open Data cloud
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
28. VAERS as linked data
• Transform the VAERS dataset into RDF to enable better integration with existing resources
• No need to worry about resources’ structure (CSV,
databases, XML)
• Each report is an instance of a VAERS report
• The system will also provide technical infrastructure to test classification
• RDF automatically generated from the database
containing VAERS data
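The database-to-RDF step might look like the following sketch, assuming hypothetical column names and a made-up example.org namespace (the real VAERS schema and URIs differ):

```python
# Sketch of generating RDF (N-Triples) from database rows. Column names
# ('vaers_id', 'vaccine', 'meddra') and the example.org namespace are
# HYPOTHETICAL; the actual VAERS schema differs.
EX = "http://example.org/vaers/"

def row_to_ntriples(row):
    """Serialize one report row as N-Triples lines."""
    s = f"<{EX}report/{row['vaers_id']}>"
    lines = [
        f'{s} <{EX}vocab#vaccine> "{row["vaccine"]}" .',
        f'{s} <{EX}vocab#meddra_term> "{row["meddra"]}" .',
    ]
    return "\n".join(lines)

print(row_to_ntriples({"vaers_id": 123, "vaccine": "H1N1", "meddra": "Urticaria"}))
```

Each report becomes a URI-identified resource, so its fields can then be linked to external vocabularies instead of staying locked in table columns.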
31. Querying across linked data
• URIs (or mappings between URIs) to link different resources
• Querying the VAERS dataset
• E.g., are there differences in the types of adverse events between a live attenuated flu vaccine and a trivalent inactivated one?
• Querying across multiple datasets
• Identify drugs in text (e.g. Benadryl) and infer they are anti-allergic
agents via DrugBank
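The flu-vaccine comparison above can be mimicked over toy triples (invented data; a real system would run a SPARQL query against the RDF store):

```python
# Toy version of the query "do adverse event types differ between live
# attenuated (LAIV) and trivalent inactivated (TIV) flu vaccines?".
# The triples are INVENTED; a real system would issue SPARQL over RDF.
from collections import defaultdict

triples = [
    ("report1", "vaccine", "LAIV"),
    ("report1", "event", "Pyrexia"),
    ("report2", "vaccine", "TIV"),
    ("report2", "event", "Injection site pain"),
    ("report3", "vaccine", "LAIV"),
    ("report3", "event", "Rhinorrhoea"),
]

def events_by_vaccine():
    """Group reported adverse events by the vaccine given (a join over triples)."""
    vacc = {s: o for s, p, o in triples if p == "vaccine"}
    out = defaultdict(set)
    for s, p, o in triples:
        if p == "event":
            out[vacc[s]].add(o)
    return dict(out)

print(events_by_vaccine())
```

The join on the report subject is exactly what a SPARQL engine does with shared variables; the advantage of linked data is that the joined resources can live in different datasets (e.g., DrugBank for drug classes).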
32. Example: link the state code in VAERS to state info in DBpedia, and pass the result to the Google Visualization API
33. Acknowledgements
• Alan Ruttenberg, Ryan Brinkman
• Oliver He, Yu Lin, Lindsay Cowell, Barry Smith, Ryan
Brinkman, Peter d’Eustachio, Albert Goldfain
• Julie Lafleche, Lauren McDonald, Robert Pless,
Barbara Law, Jan Bonhoeffer, Jean-Paul Collet
• Brinkman lab