We were interested in whether we could model well-established clinical risk guidelines in OWL and use them to automatically classify patient data with respect to "risk" (e.g. using the Framingham risk categories). What we ended up doing, however, was wandering down a very interesting path of attempting to model clinical intuition! This deck reports the first phase of the experiment. A subsequent SlideShare will give part II of this investigation.
This is the work of Soroush Samadian, Ph.D. Candidate at the University of British Columbia Bioinformatics Graduate Programme.
Enhancing Reproducibility and Transparency in Clinical Research through Semantic Technologies
1. Life Web Science 2013, Paris
Improving Transparency and Reproducibility
of Clinical Research
Using Semantic Technologies
Soroush Samadian & Mark Wilkinson
Isaac Peral Senior Researcher in Biological Informatics
Centro de Biotecnología y Genómica de Plantas, UPM, Madrid, Spain
Adjunct Professor of Medical Genetics, University of British Columbia
Vancouver, BC, Canada.
2. Can we make the Web a
scientific research platform
from hypothesis right through to publication?
3. Focus on publishing citable snippets
of scientific knowledge using SemWeb standards
4. That’s a good start with respect to academic publishing, but...
What about the rest of the scientific process?
7. Context
Multiple recent surveys of high-throughput biology
reveal that upwards of 50% of published studies
are not reproducible
- Baggerly, 2009
- Ioannidis, 2009
8. Context
Similar (if not worse!) in clinical studies
- Begley & Ellis, Nature, 2012
- Booth, Forbes, 2012
- Huang & Gottardo, Briefings in Bioinformatics, 2012
9. Context
“the most common errors are simple,
the most simple errors are common”
At least partially because the
analytical methodology was inappropriate
and/or not sufficiently described
- Baggerly, 2009
10. Context
These errors pass peer review
The researcher is unaware of the error
The process that led to the error is not recorded
Therefore it cannot be detected during peer-review
11. Context
Discovery of such errors has resulted in retractions
and even shut-down clinical trials
- Ioannidis, 2009
13. Context
Institute of Medicine Recommendations
For Conduct of High-Throughput Research:
Evolution of Translational Omics Lessons Learned and the Path Forward. The
Institute of Medicine of the National Academies, Report Brief, March 2012.
1. Rigorously-described, -annotated, and -followed data
management and manipulation procedures
2. “Lock down” the computational analysis pipeline once it
has been selected
3. Publish the analytical workflow in a formal manner,
together with the full starting and result datasets
20. QUDT: Quantities, Units, Dimensions and Types Ontology
conversion offset & conversion multiplier
enable conversion between a non-SI unit and its SI equivalent.
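A minimal sketch of how an offset and a multiplier can drive a conversion. It assumes the convention SI = value × multiplier + offset; unit ontologies differ on how the offset is applied, so check the convention of the version you load. The `Unit` class, the function names, and the constants are illustrative and are not read from QUDT itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    multiplier: float    # scale factor from this unit to the SI unit
    offset: float = 0.0  # additive shift, expressed in the SI unit (assumed convention)


def to_si(value: float, unit: Unit) -> float:
    """Convert a value in `unit` to its SI equivalent."""
    return value * unit.multiplier + unit.offset


def from_si(si_value: float, unit: Unit) -> float:
    """Invert to_si: express an SI value in `unit`."""
    return (si_value - unit.offset) / unit.multiplier


def convert(value: float, source: Unit, target: Unit) -> float:
    """Convert between two units of the same dimension via the SI unit."""
    return from_si(to_si(value, source), target)


# Illustrative constants, typed in by hand for this sketch.
MMHG = Unit("mmHg", 133.322)         # pascal per mmHg (approximate)
KPA = Unit("kPa", 1000.0)            # pascal per kilopascal
CELSIUS = Unit("degC", 1.0, 273.15)  # kelvin = celsius + 273.15

print(round(convert(146, MMHG, KPA), 3))
```

Routing every conversion through the SI unit means each unit needs only one multiplier and one offset, rather than a table of pairwise factors.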
21. OM: Ontology of Units of Measurement
Useful for inventing new units that are commonly used in clinical research
but not (currently) in any Unit ontology
22. ID   HEIGHT  WEIGHT  SBP   CHOL  HDL  BMI GR  SBP GR  CHOL GR  HDL GR
    pt1  1.82    177     128   227   55   0       0       1        0
    pt2  179     196     13.4  5.9   1.7  1       0       1        0
Legacy clinical dataset
used in our studies
Height in m and cm
Chol in mmol/l and mg/l
...and other delicious weirdness
Expert decision on “risk”
(e.g. BMI GR=1 means “at health-risk with this BMI”)
23. GOAL: get the clinical researcher
“out of the loop” once the data is collected
Complete transparency in analysis with
NO PEEKING & NO TWEAKING!
(see U.S. IOM Recommendations)
24. Extending the GALEN ontology with richer logic
including measurement values and units
measure:SystolicBloodPressure =
galen:SystolicBloodPressure and
("sio:has measurement value" some ("sio:measurement" and
("sio:has unit" some "om:unit of measure") and
("om:dimension" value "om:pressure or stress dimension") and
"sio:has value" some rdfs:Literal))
Very general definition
“some kind of pressure unit”
25. Now Galen classes can be used to “carry” rich data
Move beyond use of ontologies for simple keyword curation
(keyword hierarchies are SO last-decade!)
26. Now, what do we do with the unit-soup that is in our legacy dataset?
27. SADI Semantic Web Service for automated Unit conversion
• Send it a dataset with mixed units
• (optional) tell it the harmonized unit you want back
• Returns you a dataset with harmonized units
Automatic semantic detection of the “nature”
of the incoming unit type (e.g. “unit of pressure”)
Automatic conversion based on dimensionality and/or offset & multiplier
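The behaviour described above can be sketched locally. This is not the SADI service itself: the `UNITS` table, the `harmonize` function, and the conversion factors are assumptions made for illustration. Each unit carries a dimension label, so a length can never be silently converted into a pressure.

```python
# unit -> (dimension, factor to the SI unit of that dimension); illustrative values
UNITS = {
    "mmHg": ("pressure", 133.322),
    "cmHg": ("pressure", 1333.22),
    "kPa": ("pressure", 1000.0),
    "cm": ("length", 0.01),
    "m": ("length", 1.0),
}


def harmonize(records, target_unit):
    """Return (value, unit) pairs re-expressed in target_unit, rounded to 3 decimals."""
    target_dimension, target_factor = UNITS[target_unit]
    harmonized = []
    for value, unit in records:
        dimension, factor = UNITS[unit]
        if dimension != target_dimension:
            # The "nature" of the incoming unit does not match the requested one.
            raise ValueError(f"cannot convert {unit} ({dimension}) to {target_unit}")
        harmonized.append((round(value * factor / target_factor, 3), target_unit))
    return harmonized


mixed = [(15, "cmHg"), (146, "mmHg")]
print(harmonize(mixed, "kPa"))
```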
28. Create additional ontological classes
representing clinical features of interest based on clinical guidelines
measure:HighRiskSystolicBloodPressure =
measure:SystolicBloodPressure and
(sio:hasMeasurement some
(sio:Measurement and
("sio:has unit" value om:kilopascal) and
(sio:hasValue some double[>= "18.7"^^double])))
Remember that this is from
our extension of Galen
Extend, Reuse, Recycle!
Now we’re being specific
MUST be in kilopascal and must be >= 18.7
29. SELECT ?record ?convertedvalue ?convertedunit
FROM <./patient.rdf>
WHERE {
?record rdf:type measure:HighRiskSystolicBloodPressure .
?record sio:hasMeasurement ?measurement .
?measurement sio:hasValue ?convertedvalue .
?measurement sio:hasUnit ?convertedunit .
?record cardio:ExpertClassification ?riskgrade .
}
RecordID  Start Val  Start Unit  End Val  End Unit
cm_hg1    15         cmHg        19.998   KiloPascal
cm_hg2    14.6       cmHg        19.465   KiloPascal
mm_hg1    14.8       mmHg        19.731   KiloPascal
mm_hg2    146        mmHg        19.465   KiloPascal
SHARE query
(SHARE is a SADI-enhanced SPARQL query engine)
Because the OWL definition of HighSBP
required kilopascal, SHARE used SADI to
auto-convert everything into kilopascal
31. Framingham risk measurements:
Age
Gender
Height
Weight
Body Mass Index (BMI)
Systolic Blood Pressure (SBP)
Diastolic Blood Pressure (DBP)
Glucose
Cholesterol
Low Density Lipoprotein (LDL)
High Density Lipoprotein (HDL)
Triglyceride (TG)
All modeled
as OWL Classes
much the same
as described before
32. Measurements like BMI are derived from calculations
over other “core” measurements
Again, we use SADI and semantics to achieve this automatically
(and of course, any unit conflicts in the input data are also automatically
detected and resolved by the previous SADI service we discussed)
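As a rough local illustration of a derived measurement (not the SADI service), BMI can be computed once height and weight are harmonized to metres and kilograms. The function name, the unit tables, and the sample patient values are assumptions made for this sketch.

```python
# Conversion factors to the units the BMI formula expects (kg and m).
WEIGHT_TO_KG = {"kg": 1.0, "lb": 0.45359237}
HEIGHT_TO_M = {"m": 1.0, "cm": 0.01}


def bmi(weight, weight_unit, height, height_unit):
    """Body Mass Index in kg/m^2, after harmonizing the input units."""
    kilograms = weight * WEIGHT_TO_KG[weight_unit]  # KeyError on an unknown unit
    metres = height * HEIGHT_TO_M[height_unit]
    return kilograms / (metres * metres)


# The same hypothetical patient, recorded once in metres and once in centimetres.
print(round(bmi(80, "kg", 1.82, "m"), 1))
print(round(bmi(80, "kg", 182, "cm"), 1))
```

Both calls give the same BMI, which is the point: the unit mix in the input no longer changes the derived value.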
33. Semantic Modeling of the
American Heart Association Risk grades
HighRiskBMI =
PatientRecord and
(sio:hasAttribute some
(cardio:BodyMassIndex and sio:hasMeasurement some
(sio:Measurement and
(sio:hasUnit value cardio:kilogram-per-meter-squared) and
(sio:hasValue some double[>= 25.0]))))
Limit taken directly from clinical guidelines
Similarly for SBP, Cholesterol, etc....
34. SHARE Query for High Risk SBP
(SHARE is a SADI-enhanced SPARQL query engine)
38. But... We encoded and were following the guidelines!
We double-checked and our definitions were definitely correct
How could we possibly be wrong??
39. Visual inspection of data and guidelines showed
in many cases the clinician had “tweaked” the guideline
------------------
AHA BMI risk threshold: BMI=25
In our dataset the clinical researcher used BMI=26
------------------
HDL “official” guideline HDL=1.03mmol/l
The dataset from our researcher: HDL=0.89mmol/l
-------------------
40. Adjusting our OWL class definitions and re-running the analysis
Resulted in nearly 100% correspondence with the clinical researcher
(at least for binary risk assessment on simple measurements)
Before (official guideline threshold):
HighRiskCholesterolRecord=
PatientRecord and
(sio:hasAttribute some
(cardio:SerumCholesterolConcentration and
sio:hasMeasurement some ( sio:Measurement and
(sio:hasUnit value cardio:mili-mole-per-liter) and
(sio:hasValue some double[>= 5.0]))))
After (threshold adjusted to match the clinical researcher):
HighRiskCholesterolRecord=
PatientRecord and
(sio:hasAttribute some
(cardio:SerumCholesterolConcentration and
sio:hasMeasurement some ( sio:Measurement and
(sio:hasUnit value cardio:mili-mole-per-liter) and
(sio:hasValue some double[>= 5.2]))))
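The effect of moving a threshold can be sketched as a simple agreement score between the automatic classification and the expert's flag. The records below are hypothetical, chosen only to show the mechanism; they are not taken from the study dataset, and `agreement` is an illustrative helper.

```python
def agreement(records, threshold):
    """Fraction of records where (value >= threshold) matches the expert's 0/1 flag."""
    matches = sum(
        (value >= threshold) == bool(expert_flag)
        for value, expert_flag in records
    )
    return matches / len(records)


# (cholesterol in mmol/l, expert risk flag) -- hypothetical values
RECORDS = [(4.8, 0), (5.1, 0), (5.3, 1), (6.0, 1)]

GUIDELINE_THRESHOLD = 5.0  # threshold in the first class definition
CLINICIAN_THRESHOLD = 5.2  # threshold in the adjusted class definition

print(agreement(RECORDS, GUIDELINE_THRESHOLD))
print(agreement(RECORDS, CLINICIAN_THRESHOLD))
```

A patient at 5.1 mmol/l is "high risk" under the first definition and "not high risk" under the second, so agreement with the expert depends entirely on which threshold the class encodes.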
41. Reflect on this for a second... Because this is important!
1. We automated data cleansing and analysis using Semantic Web Services
2. We encoded clinical guidelines in OWL (first time this has been done AFAIK)
3. We found that clinical researchers did not follow the official guidelines
• This is fine! They’re the experts! But...
4. Their “personalization” of the guidelines was unreported
5. Nevertheless, we were able to create “personalized” OWL Classes
representing the viewpoint of that clinical researcher
6. These personalized viewpoints, in OWL, were published on the Web
7. These published, personalized OWL classes can be automatically re-used
by others to interpret their own data using that clinician’s viewpoint
42. AHA:HighRiskCholesterolRecord =
PatientRecord and
(sio:hasAttribute some
(cardio:SerumCholesterolConcentration and
sio:hasMeasurement some ( sio:Measurement and
(sio:hasUnit value cardio:mili-mole-per-liter) and
(sio:hasValue some double[>= 5.0]))))
McManus:HighRiskCholesterolRecord =
PatientRecord and
(sio:hasAttribute some
(cardio:SerumCholesterolConcentration and
sio:hasMeasurement some ( sio:Measurement and
(sio:hasUnit value cardio:mili-mole-per-liter) and
(sio:hasValue some double[>= 5.2]))))
PREFIX AHA: <http://americanheart.org/measurements/>
PREFIX McManus: <http://stpaulshospital.org/researchers/mcmanus/>
43. To do the “experiment” using AHA guidelines
SELECT ?patient ?risk
WHERE {
?patient rdf:type AHA:HighRiskCholesterolRecord .
?patient ex:hasCholesterolProfile ?risk
}
44. To do the “experiment” using McManus guidelines
SELECT ?patient ?risk
WHERE {
?patient rdf:type McManus:HighRiskCholesterolRecord .
?patient ex:hasCholesterolProfile ?risk
}
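Since the two queries differ only in the namespace of the class, the choice of viewpoint can be treated as a parameter. A small sketch of that idea follows; the template and `build_query` are illustrative, the two viewpoint namespaces are the ones given on the earlier slide, and the `ex:` namespace is a placeholder because the slides do not state it.

```python
QUERY_TEMPLATE = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/>
PREFIX {prefix}: <{namespace}>
SELECT ?patient ?risk
WHERE {{
  ?patient rdf:type {prefix}:HighRiskCholesterolRecord .
  ?patient ex:hasCholesterolProfile ?risk
}}
"""

# Viewpoint name -> namespace where that viewpoint's OWL classes are published.
VIEWPOINTS = {
    "AHA": "http://americanheart.org/measurements/",
    "McManus": "http://stpaulshospital.org/researchers/mcmanus/",
}


def build_query(viewpoint):
    """Return the query text for the chosen viewpoint (KeyError if unknown)."""
    return QUERY_TEMPLATE.format(prefix=viewpoint, namespace=VIEWPOINTS[viewpoint])


print(build_query("McManus"))
```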
46. CAN WE INTERPRET COMPLEX
CLINICAL PHENOTYPES?
Problem #3: Moving Beyond Simple Binary Risk
47. The next step was to attempt to model the
Framingham Risk Scores
e.g. 10-year Cardiovascular Disease Risk
This takes a large number of variables
(SBP, BMI, and disease states such as diabetes)
and calculates a patient’s risk
48. How do we do with these non-trivial cases?
...not very well LOL!
49. OWL Modeling of Framingham 10-year risk for general CVD
UGH! Awful!
50. Discussions with the clinical researcher revealed the problem...
The patients were on drugs that affected their clinical measurements
(effectively, the drugs made them more “normal”)
however the expert continued to classify them as having the clinical problem
based on their implicit knowledge
regardless of the clinical measurement value
51. Can we compensate for that level of expert intuition?
We believe so
and the required knowledge is already encoded for us!
53. The resource to automate interpretation
of a patient’s prescriptions exists
and would allow us to (more) properly
interpret their phenotype
IF
We could accurately get this information
out of their clinical record
54. Patient1 Patient2
DRUG 1 ASPRIN* ASCRIPTIN
DRUG 1 DOSAGE 1 DLY 1DLY,10MG AS NEEDED
DRUG 2 PROCARDIA PERSANTINE
DRUG 2 DOSAGE 10MG 1 3X DLY 75MG TID
DRUG 3 BUFFERIN Lopid
DRUG 3 DOSAGE 1DLY 4X300MG DLY
DRUG 4 VASOTEC DICUMAROL
DRUG 4 DOSAGE 2 DLY
DRUG 5 XSD ASCRIPTIN TRANRENE
DRUG 5 DOSAGE
DRUG 6 DIPYRIDAMOLE 100MG PERSANTINE
DRUG 6 DOSAGE 1 75MG, 3X DLY
DRUG 7 VASOTEC
DRUG 7 DOSAGE
Treated for HBP 1 1
Treated for Diabetes 1 1
Treated for High Cholesterol 0 1
56. RxNav and ChemSpider have APIs
for canonicalization of drug names
Use SADI service workflow to migrate legacy data
into canonicalized form
57. This workflow has a ~4% failure rate
(my small trials with Google Suggest looked promising
at improving this!)
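To illustrate the kind of spelling repair that could catch some of those failures, here is a sketch using approximate string matching from the Python standard library. This is a swapped-in local technique, not the RxNav/ChemSpider workflow and not Google Suggest; the vocabulary is typed in by hand from names that appear in the legacy table, and `canonicalize` and its cutoff are assumptions.

```python
import difflib
import re

# Hand-typed stand-in for a real drug vocabulary.
VOCABULARY = [
    "ASPIRIN", "ASCRIPTIN", "PROCARDIA", "PERSANTINE", "BUFFERIN",
    "LOPID", "VASOTEC", "DICUMAROL", "DIPYRIDAMOLE",
]


def canonicalize(raw, cutoff=0.8):
    """Return the closest vocabulary entry for a raw chart entry, or None."""
    cleaned = re.sub(r"[^A-Z]", "", raw.upper())  # drop '*', digits, spaces, etc.
    matches = difflib.get_close_matches(cleaned, VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for entry in ["ASPRIN*", "Lopid", "XSD"]:
    print(entry, "->", canonicalize(entry))
```

Entries that fall below the cutoff come back as `None`, so they can be routed to a human instead of being silently mapped to the wrong drug.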
58. UnderHypertensionTreatment=
galen:Patient and
cardio:isPrescribed some
(cardio:CanonicalizedDrugCollection and
sio:has_member some
(cardio:HypertensionTreatmentMedication))
cardio:HypertensionTreatmentMedication=
cardio:CanonicalDrugRecord and ( ndf:may_treat some ndf:Hypertension )
Adding prescription information into our OWL
Framingham Risk models
Note how easy it is to connect semantic data into your system -
Just refer to it in your definition!! Also note that we’re not listing a bunch of drugs,
we’re including any drug defined as a Hypertension treatment by NDF-RT.
The Semantic Definition, not an explicit drug list!
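The difference between a semantic definition and an explicit drug list can be sketched in a few lines. The `MAY_TREAT` mapping below is a tiny hand-written stand-in for the may_treat relation, not NDF-RT content, and the function names are illustrative.

```python
# drug -> conditions it may treat (toy stand-in for a drug knowledge base)
MAY_TREAT = {
    "enalapril": {"Hypertension", "Heart failure"},
    "gemfibrozil": {"Hyperlipidemia"},
    "aspirin": {"Pain", "Fever"},
}


def is_treatment_for(drug, condition):
    """True if the knowledge base says this drug may treat the condition."""
    return condition in MAY_TREAT.get(drug, set())


def under_treatment(prescriptions, condition):
    """Mirror of the 'some member may_treat X' restriction: any one drug suffices."""
    return any(is_treatment_for(drug, condition) for drug in prescriptions)


print(under_treatment(["aspirin", "enalapril"], "Hypertension"))
print(under_treatment(["aspirin"], "Hypertension"))
```

Nothing in `under_treatment` names a drug. Adding a new hypertension drug to the knowledge base changes the classification without touching the definition.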
59. Now how are we doing?
The answer is a bit surprising...
60. Patient ID  Automatic Risk Grade         Expert-assigned Grade
                (based on drugs prescribed)  (BP_TREATMENT_STATUS)
    Uri4627     1                            1
    Uri4275     1                            1
    Uri822      1                            0
    Uri893      1                            1
For “treated for cholesterol” and “treated for diabetes”
we achieved detection specificities of 96% to 99%
But for blood pressure it was a bit more of a mess...
Only 44% specificity! What went wrong?
61. A large number of drugs are used to treat blood pressure
but these drugs can also be used to treat other things
From the perspective of the treating clinician
the purpose for which they prescribed the drug
is the purpose that they record in the chart
62. But the drug has other effects
that the clinical researcher
might not (does not) account for
in their expert evaluation
How do we define “correct” in this scenario?
????
(i.e. Is this a bug, or a feature??)
63.            Accuracy       Precision      Recall
               before after   before after   before after
High Risk      0.82   0.89    0.71   0.84    0.30   0.61
Moderate Risk  0.68   0.73    0.68   0.72    0.71   0.74
Low Risk       0.76   0.83    0.55   0.65    0.80   0.81
Overall
Our ability to classify raw clinical data
(with spelling mistakes and all)
into the Framingham Risk evaluation
compared to the expert clinical assessment
before (white on the slide) = before including prescription information
after (grey on the slide) = including the NDF-RT drug knowledgebase
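For readers who want to reproduce this kind of comparison on their own data, here is a sketch of per-class accuracy, precision, and recall, treating the expert's grade as the reference. The label lists are hypothetical and `class_metrics` is an illustrative helper; how the reported figures were actually computed is not stated on the slide.

```python
def class_metrics(expert, automatic, label):
    """One-vs-rest accuracy, precision and recall for a single risk class."""
    pairs = list(zip(expert, automatic))
    tp = sum(e == label and a == label for e, a in pairs)  # both say `label`
    fp = sum(e != label and a == label for e, a in pairs)  # only automatic says it
    fn = sum(e == label and a != label for e, a in pairs)  # only expert says it
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical grades for six patients.
expert = ["high", "high", "moderate", "low", "low", "moderate"]
automatic = ["high", "moderate", "moderate", "low", "moderate", "moderate"]

for label in ("high", "moderate", "low"):
    print(label, class_metrics(expert, automatic, label))
```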
64. We’re looking for other “intuitive” decisions
made by the clinician
that will account for our remaining inaccuracies
We are optimistic that we can record
at least some of these in OWL and/or
as features within the SPARQL query
Remember – the objective is transparency
not necessarily 100% semantic encoding
65. Interestingly, we were also able to create a simple OWL Class
that allowed classification of patients based on being
prescribed contra-indicated drugs
~4.2% of patients were taking dangerous drug combinations
We “got this for free” by connecting
a bunch of semantic resources together!
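The logic of such a class can be sketched as a pairwise check over each patient's canonicalized prescriptions. The drug names and the contraindication set below are placeholders, not real interaction data, and both helper functions are illustrative.

```python
from itertools import combinations

# Unordered pairs that must not be co-prescribed (placeholder names).
CONTRAINDICATED = {frozenset({"drug_a", "drug_b"})}


def flagged_pairs(prescriptions):
    """Return the contraindicated pairs present in one patient's prescriptions."""
    unique = sorted(set(prescriptions))
    return [pair for pair in combinations(unique, 2)
            if frozenset(pair) in CONTRAINDICATED]


def fraction_flagged(cohort):
    """Share of patients with at least one contraindicated pair."""
    return sum(bool(flagged_pairs(p)) for p in cohort.values()) / len(cohort)


cohort = {
    "pt1": ["drug_c", "drug_b", "drug_a"],
    "pt2": ["drug_a", "drug_c"],
}
print(flagged_pairs(cohort["pt1"]))
print(fraction_flagged(cohort))
```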
66. Take-home messages - repeated
1. We automated analysis using Semantic Web Services
2. We encoded clinical guidelines in OWL
3. We found that clinical researchers did not follow the official
guidelines
• This is fine! They’re the experts! But...
4. Their “personalization” of the guidelines was unreported
5. We were able to create “personalized” OWL Classes
representing the viewpoint of that clinical researcher
6. These personalized viewpoints were published on the Web
7. The OWL classes can be automatically re-used by others
68. The OWL Classes we constructed
represent a particular clinician’s view of “reality”
By my definition, that IS a hypothesis!
Other work in our lab has demonstrated* that we can
duplicate an entire published research paper
“simply” by creating an OWL class representing
the hypothetical view of that researcher
(and note that these hypotheses are explicit, shared on the Web,
and re-usable by others!!)
* Wood et al, Proc. ISoLA, 2012