Validation of a Natural Language Processing Protocol for Detecting Heart Failure Signs and Symptoms in Electronic Health Record Text Notes
Roy J. Byrd², Steven R. Steinhubl¹, Jimeng Sun², Shahram Ebadollahi², Zahra Daar¹, Walter F. Stewart¹
¹ Geisinger Medical Center, Center for Health Research, Danville, PA
² IBM, T.J. Watson Research Center, Hawthorne, NY
Outline
•   Background and objectives
•   Datasets
•   Tools & Methods
•   Results
•   Discussion
    – Challenges
    – Opportunities
• Summary


• (Iterative annotation refinement)
Background and Objectives
• Background
   – Framingham criteria for HF published in 1971
   – Geisinger/IBM “PredMED” project on predictive modeling for
     early detection of HF, using longitudinal EHRs


• Overall Project Objective
    Better understand the presentation of HF in the primary care
     setting, in order to facilitate its more rapid identification and
                                  treatment


• Objective of this paper:
     Build and validate NLP extractors for Framingham criteria
     (signs and symptoms) from EHR clinical notes, so that they
       may be suitable for downstream diagnostic applications
Framingham HF Diagnostic Criteria
MAJOR SYMPTOMS
1. Paroxysmal Nocturnal Dyspnea (PND) or Orthopnea
2. Neck Vein Distension (JVD)
3. Rales
4. Radiographic Cardiomegaly
5. Acute Pulmonary Edema
6. S3 Gallop
7. Increased Central Venous Pressure (> 16 cm H2O at RA)
8. Circulation Time of 25 seconds**
9. Hepatojugular Reflux (HJR)
10. Weight loss 4.5 kg in 5 days in response to treatment

MINOR SYMPTOMS
1. Bilateral Ankle Edema
2. Nocturnal Cough
3. Dyspnea on ordinary exertion
4. Hepatomegaly
5. Pleural effusion
6. A decrease in vital capacity by 1/3 of the maximal value recorded**
7. Tachycardia (>120 BPM)

** Not extracted, since these criteria are not documented in routine clinical practice.

N Engl J Med. 1971;285:1441-1446.
(Sample downstream analysis)

[Figure: bar chart, "Reports of Framingham HF criteria in the year prior to diagnosis". Y-axis: Percent with Documented Criteria (0 to 60). Series: Cases (N=4,644) and Controls (N=45,981). Criteria on the x-axis: PND, Rales, JVD, Pulm Edema, CMegaly, Ankle Edema, DOE.]
Datasets
• Clinical notes from longitudinal (2001-2010) EHR
  encounters for
   – 6,355 case patients
       • Meet operational criteria for HF**
   – 26,052 control patients
       • Clinic-, gender- and age-matched to cases
   – The case-control distinction is exploited in downstream
     applications; it’s not relevant for criteria extraction.
• Development dataset
   – 65 encounter notes
       • Selected for density of Framingham criteria
       • Annotated by a clinical expert
• Validation dataset
   – 400 encounter notes (200 cases & 200 controls)
       • Randomly selected
       • Annotated by consensus of 4 trained coders
       • N = 1492 criteria

**Operational HF Criteria
   – HF diagnosis on problem list,
   – HF diagnosis in EHR for two outpatient encounters,
   – Two or more medications with ICD-9 code for HF, or
   – One HF diagnosis and one medication with ICD-9 code for HF
Tools


• LRW1 – LanguageWare Resource Workbench
    – Basic Text Processing
    – Dictionaries
    – Grammars
• UIMA2 – Unstructured Information Management Architecture
    – Execution Pipeline, including I/O management
    – Text Analysis Engines
• TextSTAT3 – Simple Text Analysis Tool
    – Concordance program, used for linguistic analysis

[Diagram: UIMA Collection Processing Engine. Encounter Documents → Basic Processing for paragraphs, sentences, tokenization, etc. → Dictionaries and Grammars for recognizing criteria candidates → Text Analysis Engines for applying constraints and annotating criteria → Extracted Criteria.]

1 http://www.alphaworks.ibm.com/tech/lrw   2 http://uima.apache.org   3 http://neon.niederlandistik.fu-berlin.de/en/textstat
Criteria Extraction Methods: Dictionaries
• Framingham Criteria vocabulary
   – Words and phrases used to mention the 15 Framingham Criteria
   – edema, leg edema, oedema; shortness of breath, SOB
   – Size: ~75 “lemma forms” (main entries) and hundreds of variant forms
• Segment Header words and phrases
   – Patient History, Examination, Plan, Instruction
• Negating words
   – Used to deny criteria
       • no, free of, ruled out
• Counterfactual triggers
   – The criteria may not have occurred
       • if, should, as needed for
• Miscellaneous Classes
   – Weight loss phrases
       • lose weight, diurese
   – Time value words
       • day, week, month
   – Weight units
       • pound, kilogram
   – Diuretics
       • Bumex, Furosemide
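The dictionary classes above can be illustrated with a minimal Python sketch. It is not the LanguageWare implementation: the grouping of variants under the lemma forms "edema" and "shortness of breath", the names `CRITERIA_VARIANTS` and `lookup`, and the longest-match-first strategy are all assumptions made for illustration, and only the example entries listed above are included.

```python
import re

# Hypothetical mini-dictionary: variant form -> lemma form (main entry).
# Only the example entries shown above are included; the real dictionaries
# hold ~75 lemma forms and hundreds of variant forms.
CRITERIA_VARIANTS = {
    "edema": "edema",
    "leg edema": "edema",
    "oedema": "edema",
    "shortness of breath": "shortness of breath",
    "sob": "shortness of breath",
}
NEGATING_WORDS = ("no", "free of", "ruled out")
COUNTERFACTUAL_TRIGGERS = ("if", "should", "as needed for")


def _compile(phrases):
    """Build one case-insensitive pattern, longest phrases first,
    so that "leg edema" is preferred over the shorter "edema"."""
    ordered = sorted(phrases, key=len, reverse=True)
    alternation = "|".join(re.escape(p) for p in ordered)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


_CRITERIA_PATTERN = _compile(CRITERIA_VARIANTS)


def lookup(text):
    """Return (start, end, surface form, lemma form) for each dictionary hit."""
    return [
        (m.start(), m.end(), m.group(0), CRITERIA_VARIANTS[m.group(0).lower()])
        for m in _CRITERIA_PATTERN.finditer(text)
    ]
```

For example, `lookup("No leg edema. Denies SOB.")` yields one hit for "leg edema" and one for "SOB", each carrying its lemma form; deciding whether the hit is negated is left to later pipeline stages.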
Criteria Extraction Methods: Grammars
• Shallow English syntax
   – Noun Phrases
      • some moderate DOE
   – Compound Noun Phrases
      • chest pain, DOE, or night cough
   – Prepositional Phrases
• No full-sentential parses
   – Not needed for simple HF criteria
   – Unreliable sentence boundaries and syntax in clinical notes
• Negated Scope
   – regular rate and rhythm without murmurs, clicks, gallops, or rubs
• Counterfactual Scope
   – Patient should call if she experiences shortness of breath
• Weight Loss
   – 20 pound weight loss in a week with diuretics
• Tachycardia
   – tachy at 120 (to 130)
   – HR: 135
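The Weight Loss and Tachycardia grammars above recognize short numeric phrases. A rough Python approximation with regular expressions is sketched below; the patterns, the function names `parse_heart_rate` and `parse_weight_loss`, and the returned tuples are hypothetical and cover only the example phrasings shown above, not the actual LanguageWare grammar rules.

```python
import re

# Hypothetical pattern for heart-rate mentions such as
# "tachy at 120 (to 130)", "tachy @ 115" or "HR: 135".
HEART_RATE = re.compile(
    r"\b(?P<cue>tachy(?:cardia|cardic)?|HR|heart rate)\b"
    r"\s*(?:at|@|:|of)?\s*"
    r"(?P<low>\d{2,3})"
    r"(?:\s*\(?\s*(?:to|-)\s*(?P<high>\d{2,3})\s*\)?)?",
    re.IGNORECASE,
)

# Hypothetical pattern for phrases such as "20 pound weight loss in a week".
WEIGHT_LOSS = re.compile(
    r"(?P<amount>\d+(?:\.\d+)?)\s*(?P<unit>pounds?|lbs?|kilograms?|kg)"
    r"\s+weight\s+loss\s+(?:in|over)\s+"
    r"(?P<count>an?|\d+)\s+(?P<period>day|week|month)s?\b",
    re.IGNORECASE,
)


def parse_heart_rate(text):
    """Return (low, high) beats per minute, with high=None if no range is given."""
    m = HEART_RATE.search(text)
    if m is None:
        return None
    high = int(m["high"]) if m["high"] else None
    return int(m["low"]), high


def parse_weight_loss(text):
    """Return (amount, unit, count, period) for a weight-loss phrase, or None."""
    m = WEIGHT_LOSS.search(text)
    if m is None:
        return None
    count = 1 if m["count"].lower() in ("a", "an") else int(m["count"])
    return float(m["amount"]), m["unit"].lower(), count, m["period"].lower()
```

These functions only produce candidates with their numeric values; whether a candidate affirms a criterion is a separate decision.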
Criteria Extraction Methods: Text Analysis Engines (TAEs)
• Rules to filter candidate criteria created from dictionaries and grammars.
• Deny criteria mentioned in negated contexts
   – “regular rate and rhythm without murmurs, clicks, gallops, or rubs” → S3Neg
• Ignore criteria in counterfactual contexts
   – “Patient should call if she experiences shortness of breath”
• Co-occurrence constraints
   – “exercise HR: 135” doesn’t affirm Tachycardia
• Disambiguation
   – “edema” is recognized as APEdema, if near “cxr”, or in a “Radiology” note, or in a “Chest X-Ray” segment
• Numeric constraints
   – “she lost 5 pounds over a month” doesn’t affirm WeightLoss
   – “tachy @ 115” doesn’t affirm Tachycardia
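The filtering rules above can be sketched as a single decision function over a candidate criterion. This is an illustrative Python sketch, not the UIMA Text Analysis Engines themselves: the `Candidate` fields and `apply_constraints` are hypothetical names, and the numeric thresholds (more than 120 BPM; at least 4.5 kg within 5 days) are assumed from the Framingham criteria table, since the exact rules used by the engines are not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """A candidate criterion with the context needed for filtering (hypothetical)."""
    criterion: str                      # e.g. "S3", "Tachycardia", "WeightLoss"
    negated: bool = False               # falls inside a negated scope
    counterfactual: bool = False        # falls inside a counterfactual scope
    heart_rate: Optional[int] = None    # beats per minute, if stated
    exercise_context: bool = False      # e.g. "exercise HR: 135"
    kg_lost: Optional[float] = None     # weight lost, converted to kilograms
    days: Optional[float] = None        # period over which it was lost


def apply_constraints(c: Candidate) -> Optional[str]:
    """Return the final annotation label, or None if the candidate is dropped."""
    if c.counterfactual:
        return None                     # ignore: the criterion may not have occurred
    if c.negated:
        return c.criterion + "Neg"      # denied criterion, e.g. "S3Neg"
    if c.criterion == "Tachycardia":
        # Co-occurrence and numeric constraints (assumed threshold: > 120 BPM).
        if c.exercise_context or c.heart_rate is None or c.heart_rate <= 120:
            return None
    if c.criterion == "WeightLoss":
        # Numeric constraint (assumed threshold: >= 4.5 kg within 5 days).
        if c.kg_lost is None or c.days is None:
            return None
        if c.kg_lost < 4.5 or c.days > 5:
            return None
    return c.criterion
```

Under these assumptions, a negated S3 candidate becomes `S3Neg`, a heart rate of 115 or an exercise heart rate of 135 is dropped, and 5 pounds (about 2.3 kg) lost over a month does not affirm WeightLoss.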
Encounter Labeling Methods
• We can label an encounter note with labels showing the
  criteria that the note mentions
   – The labels can be used by downstream analyses to gather
     information such as: “This patient exhibited those symptoms on
     that date.”
• 2 Methods:
   – Machine-learning
      • Using candidate criteria and scope annotations, as features, …
      • use a [CHAID decision tree] classifier to assign criteria as labels.
   – Rule-based
      • Run the full extractor pipeline, then …
      • Assign labels consisting of all unique criteria that survive filtering.
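The rule-based method reduces to collecting the unique criteria that survive filtering and attaching them to the encounter. A minimal Python sketch follows; the `EncounterLabels` record, its fields, and the example patient identifier are hypothetical and serve only to show how a downstream analysis could read "this patient exhibited those symptoms on that date".

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Tuple


@dataclass(frozen=True)
class EncounterLabels:
    """Labels for one encounter note (hypothetical record layout)."""
    patient_id: str
    encounter_date: date
    labels: Tuple[str, ...]


def label_encounter(patient_id: str, encounter_date: date,
                    surviving_criteria: Iterable[str]) -> EncounterLabels:
    """Assign the unique criteria that survived filtering as encounter labels."""
    unique = tuple(sorted(set(surviving_criteria)))
    return EncounterLabels(patient_id, encounter_date, unique)


# Example: the same criterion mentioned twice in a note yields one label.
example = label_encounter("patient-001", date(2009, 3, 14),
                          ["DOExertion", "S3Neg", "DOExertion"])
```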
Results
Evaluation Flow
[Diagram: Encounter Documents → Lexical Look-up & Scope → Lexical & Scope Annotations → Machine Learning or Rules → Encounter Labels → Encounter Label Evaluation. A separate box is labeled Criteria.]

Metrics:
 Precision (Positive Predictive Value):
  #TruePositive / (#TruePositive + #FalsePositive)
 Recall (Sensitivity):
  #TruePositive / (#TruePositive + #FalseNegative)
 F-Score (the harmonic mean of Precision and Recall):
  (2 x Precision x Recall) / (Precision + Recall)
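The three metrics can be written directly as small Python functions. The function names are illustrative, and returning 0.0 when a denominator is zero is an assumption of this sketch rather than something stated in the presentation.

```python
def precision(true_positive: int, false_positive: int) -> float:
    """Positive predictive value: TP / (TP + FP)."""
    denominator = true_positive + false_positive
    return true_positive / denominator if denominator else 0.0


def recall(true_positive: int, false_negative: int) -> float:
    """Sensitivity: TP / (TP + FN)."""
    denominator = true_positive + false_negative
    return true_positive / denominator if denominator else 0.0


def f_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall: 2PR / (P + R)."""
    return 2 * p * r / (p + r) if (p + r) else 0.0
```

As a consistency check, a precision of 0.899441 and a recall of 0.738532 (the rule-based figures reported for affirmed encounter labels) give `f_score(0.899441, 0.738532)` of approximately 0.8111, in line with the reported F-Score of 0.811083.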
Encounter Labeling Performance

                     Machine-learning method            Rule-based method
                     Recall    Precision  F-Score       Recall    Precision  F-Score
 Affirmed            0.675000  0.754190   0.712401      0.738532  0.899441   0.811083
 Denied              0.945556  0.905319   0.925000      0.987599  0.931915   0.958949
 Overall             0.896364  0.881144   0.888689      0.938462  0.926720   0.932554
 Overall 99%
 Conf. Int.                    (0.848-0.929)                      (0.900-0.964)

    Conclusion: Machine-learning labeling does not significantly underperform
    rule-based labeling.
Performance of Framingham
        Diagnostic Criteria Extraction
                      Precision   Recall     F-score    99% Confidence Interval (F-score)
 Overall (exact)      0.925234    0.896864   0.910828   (0.891 - 0.929)
 Overall (relaxed)    0.948239    0.919164   0.933475   (0.916 - 0.950)
 Affirmed             0.747801    0.789474   0.768072   (0.711 - 0.824)
 Denied               0.982857    0.928058   0.954672   (0.938 - 0.970)


Note: Performance on affirmed criteria is worse, possibly because of their
greater syntactic diversity. For example, we don’t find:
         PleuralEffusion: blunting of the right costrophrenic angle
         DOExertion: she felt like she couldn’t get enough air in
Precision and Recall for Individual
             Criteria
Analysis of 1492 extracted criteria: PredMED extractions (rows) vs. Gold Standard annotations (columns)
[Confusion matrix. The column headers, rotated on the slide, repeat the row labels listed below (ANKED through WTL), followed by WTLNeg and a final "False Positive" column. The last row counts False Negatives.]
ANKED            90     6                                                                                                                           16
ANKEDNeg              230                                                                                                                            6
APED              8         5                                                        2                                                          1   22
APEDNeg                         0
DOE                                 116 17                                                         1                                                3
DOENeg                                3 135                                                        2                                                1
HEP                                           0     1
HEPNeg                                            125
HJR                                                     2   1
HJRNeg                                                      9
JVD                                                             7     2
JVDNeg                                                               91
NC                                                                        2
NCNeg                                                                         43                                                                    2
PLE                                                                                  8
PLENeg                                                                                     1
PND                                   1                                                        7    2
PNDNeg                                                                                             69
RALE                                                                                                    11                                          1
RALENeg                                                                                                      197
RC                                                                                                                 6
RCNeg                                                                                                                  1
S3G                                                                                                                          0
S3GNeg                                                                                                                           131
TACH                                                                                                                                    1           2
TACHNeg                                                                                                                                     0       4
WTL                                                                                                                                             0
False Negative    6    8    5   2     6   5   1    4    1            3               2         2   7         35    2   1     1     10
Discussion
• Challenges
   – Data quality: EHR text data is messy.
       • >10% (i.e., 26/237) of the errors are caused by misspellings & bad sentence boundaries
   – Human anatomy
       • We need a better solution than word co-occurrence constraints
   – Syntactic diversity of affirmed criteria
       • We need deeper syntactic and semantic analysis
   – Contradictions and redundancy
       • An issue for downstream analysis
• Opportunities
   – We can apply similar techniques to other collections of criteria.
       • NY Heart Association
       • European Society of Cardiology
       • MedicalCriteria.com
   – Many specific criteria extractors can be re-used in other settings.
   – For downstream applications, see posters and presentations from our project at this conference
Summary
• Extractors can identify affirmations and denials
  of Framingham HF criteria in EHR clinical notes
  with an overall F-Score of 0.91.
• Classifiers can label EHR encounters with the
  Framingham criteria they mention with an
  F-Score of 0.93.
• Information about HF criteria mentioned in EHR
  notes appears to be useful for downstream
  applications that seek to achieve early detection
  of HF.
Backup:
Iterative Annotation Refinement
Iterative Annotation Refinement
• What are the problems solved?
  – Annotations are required for training and evaluating
    criteria extractors.
  – Human annotators without guidelines have high
    precision but lower recall.
  – Domain experts’ intuitions (about the language for
    expressing criteria) are initially imprecise.
• What is produced?
  – Annotated dataset
  – Annotation guidelines
  – Criteria extractors
  … that are consistent
The Development Process: Iterative Annotation Refinement
[Diagram with two participants (Expert, Linguist) and three phases.
 Initialization: Expert and Linguist discuss the language of HF criteria; the Expert writes initial guidelines; the Linguist builds initial extractors.
 Results: Encounter Texts, Expert Annotations, Annotation Guidelines, Criteria Extractors.
 Iteration: annotate texts with current extractors → perform error analysis → the Expert updates the annotations and the guidelines; the Linguist updates the extractors.]
User interface for the annotation tool, which was
used to manage annotations during refinement.
Performance improvement during
         development
[Figure: "Performance comparison". Scatter plot of Precision (y-axis, 0.5 to 1) against Recall (x-axis, 0.5 to 1) for two series, PredMED and Clinical Expert, each with an "Initial" and a "Final" point.]
Iterative methods for creating
 annotations, guidelines, and extractors
                   Extraction       Result of using    Sources of       Arbiter for     Objective (and
                   target           the method         annotations      disagreements   metric) for each
                                                       compared in      at each         iteration
                                                       each iteration   iteration

Iterative          Framingham       - Annotations      Expert and       Expert          Improve extractor
Annotation         HF criteria      - Guidelines       Extractor                        performance (F-
Refinement                          - Extractor                                         score)

Annotation         Clinical         - Guidelines (in   Expert and       Consensus       Improve inter-
Induction          conditions       the form of an     Linguist                         annotator
(Chapman, et                        annotation                                          agreement (F-
al. J Biom Inf                      schema)                                             score)
2006)
CDKRM              Classes in the   - Annotations      2 Experts        Consensus       Improve inter-
(Coden, et al.,    cancer disease   - Guidelines                                        annotator
J Biom Inf         model                                                                agreement
2009)                                                                                   (agreement %)
TALLAL             PHI (protected   - Annotations      Expert and       Expert          Annotate full
(Carrell, et al,   health           - Extractor        Extractor                        dataset (to the
GHRI-IT            information)                                                         expert’s
poster, 2010)      classes                                                              satisfaction)

AMP-Based Variant Classification with VSClinicalAMP-Based Variant Classification with VSClinical
AMP-Based Variant Classification with VSClinicalGolden Helix
 
Underpinnings of the Interoperability Reference Architecture HISO 10040
Underpinnings of the Interoperability Reference Architecture HISO 10040Underpinnings of the Interoperability Reference Architecture HISO 10040
Underpinnings of the Interoperability Reference Architecture HISO 10040Health Informatics New Zealand
 
Underpinnings of the New Zealand Interoperability Reference Architecture
Underpinnings of the New Zealand Interoperability Reference ArchitectureUnderpinnings of the New Zealand Interoperability Reference Architecture
Underpinnings of the New Zealand Interoperability Reference ArchitectureKoray Atalag
 
ACMG-Based Variant Classification with VSClinical
ACMG-Based Variant Classification with VSClinicalACMG-Based Variant Classification with VSClinical
ACMG-Based Variant Classification with VSClinicalGolden Helix
 
Next-Generation Sequencing Analysis in VSClinical
Next-Generation Sequencing Analysis in VSClinicalNext-Generation Sequencing Analysis in VSClinical
Next-Generation Sequencing Analysis in VSClinicalGolden Helix
 
Early Cardiac Safety Data in Clinical Trials
Early Cardiac Safety Data in Clinical TrialsEarly Cardiac Safety Data in Clinical Trials
Early Cardiac Safety Data in Clinical TrialsOlivierSimon
 
Regenstrief New Gopher - Med Info 2013
Regenstrief New Gopher - Med Info 2013Regenstrief New Gopher - Med Info 2013
Regenstrief New Gopher - Med Info 2013Jon Duke, MD, MS
 
Identifying deficiencies in long-term condition management using electronic m...
Identifying deficiencies in long-term condition management using electronic m...Identifying deficiencies in long-term condition management using electronic m...
Identifying deficiencies in long-term condition management using electronic m...Health Informatics New Zealand
 
ENR AMIA Montreal 2012 V01 (2)
ENR AMIA Montreal 2012 V01 (2)ENR AMIA Montreal 2012 V01 (2)
ENR AMIA Montreal 2012 V01 (2)Nielsjans
 
2010 06 - LOINC-ICF
2010 06 - LOINC-ICF2010 06 - LOINC-ICF
2010 06 - LOINC-ICFdvreeman
 

Similar to Validation of a Natural Language Processing Protocol for Detecting Heart Failure Sins in Electronic Health Record Notes BYRD (20)

Impact Of a Clinical Decision Support Tool on Asthma Patients with Current As...
Impact Of a Clinical Decision Support Tool on Asthma Patients with Current As...Impact Of a Clinical Decision Support Tool on Asthma Patients with Current As...
Impact Of a Clinical Decision Support Tool on Asthma Patients with Current As...
 
EMR, EHR and Meaningful Use Presentation
EMR, EHR and Meaningful Use PresentationEMR, EHR and Meaningful Use Presentation
EMR, EHR and Meaningful Use Presentation
 
28 the ams solution ann hoffman - vitalea
28 the ams solution ann hoffman - vitalea28 the ams solution ann hoffman - vitalea
28 the ams solution ann hoffman - vitalea
 
Aug2015 zivana tezak analytical validation
Aug2015 zivana tezak analytical validationAug2015 zivana tezak analytical validation
Aug2015 zivana tezak analytical validation
 
Translating Clinical Guidelines into Knowledge-guided Decision Support
Translating Clinical Guidelines into Knowledge-guided Decision SupportTranslating Clinical Guidelines into Knowledge-guided Decision Support
Translating Clinical Guidelines into Knowledge-guided Decision Support
 
Clinical Tools - Faculty Development
Clinical Tools - Faculty DevelopmentClinical Tools - Faculty Development
Clinical Tools - Faculty Development
 
QA for IHC and ISH USE.pdf
QA for IHC and ISH USE.pdfQA for IHC and ISH USE.pdf
QA for IHC and ISH USE.pdf
 
[Hongsermeier] clinical decision support services amdis final
[Hongsermeier] clinical decision support services amdis final[Hongsermeier] clinical decision support services amdis final
[Hongsermeier] clinical decision support services amdis final
 
CDISC-CDASH
CDISC-CDASHCDISC-CDASH
CDISC-CDASH
 
AMP-Based Variant Classification with VSClinical
AMP-Based Variant Classification with VSClinicalAMP-Based Variant Classification with VSClinical
AMP-Based Variant Classification with VSClinical
 
Where do we currently stand at ICARDA?
Where do we currently stand at ICARDA?Where do we currently stand at ICARDA?
Where do we currently stand at ICARDA?
 
Underpinnings of the Interoperability Reference Architecture HISO 10040
Underpinnings of the Interoperability Reference Architecture HISO 10040Underpinnings of the Interoperability Reference Architecture HISO 10040
Underpinnings of the Interoperability Reference Architecture HISO 10040
 
Underpinnings of the New Zealand Interoperability Reference Architecture
Underpinnings of the New Zealand Interoperability Reference ArchitectureUnderpinnings of the New Zealand Interoperability Reference Architecture
Underpinnings of the New Zealand Interoperability Reference Architecture
 
ACMG-Based Variant Classification with VSClinical
ACMG-Based Variant Classification with VSClinicalACMG-Based Variant Classification with VSClinical
ACMG-Based Variant Classification with VSClinical
 
Next-Generation Sequencing Analysis in VSClinical
Next-Generation Sequencing Analysis in VSClinicalNext-Generation Sequencing Analysis in VSClinical
Next-Generation Sequencing Analysis in VSClinical
 
Early Cardiac Safety Data in Clinical Trials
Early Cardiac Safety Data in Clinical TrialsEarly Cardiac Safety Data in Clinical Trials
Early Cardiac Safety Data in Clinical Trials
 
Regenstrief New Gopher - Med Info 2013
Regenstrief New Gopher - Med Info 2013Regenstrief New Gopher - Med Info 2013
Regenstrief New Gopher - Med Info 2013
 
Identifying deficiencies in long-term condition management using electronic m...
Identifying deficiencies in long-term condition management using electronic m...Identifying deficiencies in long-term condition management using electronic m...
Identifying deficiencies in long-term condition management using electronic m...
 
ENR AMIA Montreal 2012 V01 (2)
ENR AMIA Montreal 2012 V01 (2)ENR AMIA Montreal 2012 V01 (2)
ENR AMIA Montreal 2012 V01 (2)
 
2010 06 - LOINC-ICF
2010 06 - LOINC-ICF2010 06 - LOINC-ICF
2010 06 - LOINC-ICF
 

More from HMO Research Network

New Rules Dealing with Conflicts of Interest in Public Health Service Funded ...
New Rules Dealing with Conflicts of Interest in Public Health Service Funded ...New Rules Dealing with Conflicts of Interest in Public Health Service Funded ...
New Rules Dealing with Conflicts of Interest in Public Health Service Funded ...HMO Research Network
 
Evaluation of the Validity of the Gestational Length Assumptions Based Upon A...
Evaluation of the Validity of the Gestational Length Assumptions Based Upon A...Evaluation of the Validity of the Gestational Length Assumptions Based Upon A...
Evaluation of the Validity of the Gestational Length Assumptions Based Upon A...HMO Research Network
 
Comparative Safety of Infliximaband Etanercept on the Risk of Serious Infecti...
Comparative Safety of Infliximaband Etanercept on the Risk of Serious Infecti...Comparative Safety of Infliximaband Etanercept on the Risk of Serious Infecti...
Comparative Safety of Infliximaband Etanercept on the Risk of Serious Infecti...HMO Research Network
 
A Multi State Markov Model for Analyzing Patterns of Use of Opiod Treatments ...
A Multi State Markov Model for Analyzing Patterns of Use of Opiod Treatments ...A Multi State Markov Model for Analyzing Patterns of Use of Opiod Treatments ...
A Multi State Markov Model for Analyzing Patterns of Use of Opiod Treatments ...HMO Research Network
 
A Descriptive Study of Vaccinations Occuring During Pregnancy HENNINGER
A Descriptive Study of Vaccinations Occuring During Pregnancy HENNINGERA Descriptive Study of Vaccinations Occuring During Pregnancy HENNINGER
A Descriptive Study of Vaccinations Occuring During Pregnancy HENNINGERHMO Research Network
 
The Use of Administrative Data and Natural Language Processing to Estimate th...
The Use of Administrative Data and Natural Language Processing to Estimate th...The Use of Administrative Data and Natural Language Processing to Estimate th...
The Use of Administrative Data and Natural Language Processing to Estimate th...HMO Research Network
 
Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L...
Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L...Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L...
Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L...HMO Research Network
 
Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C...
Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C...Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C...
Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C...HMO Research Network
 
CER HUB An Informatics Platform for Conducting Compartive Effectiveness with ...
CER HUB An Informatics Platform for Conducting Compartive Effectiveness with ...CER HUB An Informatics Platform for Conducting Compartive Effectiveness with ...
CER HUB An Informatics Platform for Conducting Compartive Effectiveness with ...HMO Research Network
 
An Application of Doubly Robust Estimation JOHNSON
An Application of Doubly Robust Estimation JOHNSONAn Application of Doubly Robust Estimation JOHNSON
An Application of Doubly Robust Estimation JOHNSONHMO Research Network
 
Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un...
Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un...Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un...
Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un...HMO Research Network
 
Expanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOK
Expanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOKExpanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOK
Expanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOKHMO Research Network
 
Drug Characteristics Associated with Medication Adherence Across Eight Diseas...
Drug Characteristics Associated with Medication Adherence Across Eight Diseas...Drug Characteristics Associated with Medication Adherence Across Eight Diseas...
Drug Characteristics Associated with Medication Adherence Across Eight Diseas...HMO Research Network
 
Feasibility of Implementing Screening Brief Intervention and Referral to Trea...
Feasibility of Implementing Screening Brief Intervention and Referral to Trea...Feasibility of Implementing Screening Brief Intervention and Referral to Trea...
Feasibility of Implementing Screening Brief Intervention and Referral to Trea...HMO Research Network
 
eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici...
eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici...eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici...
eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici...HMO Research Network
 
A Telephone Based Diabetes Prevention Program and Social Support for Weight L...
A Telephone Based Diabetes Prevention Program and Social Support for Weight L...A Telephone Based Diabetes Prevention Program and Social Support for Weight L...
A Telephone Based Diabetes Prevention Program and Social Support for Weight L...HMO Research Network
 
Technological Resources & Personnel Costs Required to Implement an Automated ...
Technological Resources & Personnel Costs Required to Implement an Automated ...Technological Resources & Personnel Costs Required to Implement an Automated ...
Technological Resources & Personnel Costs Required to Implement an Automated ...HMO Research Network
 
Online Patient Access to their Medical Record and Health Providers is Associa...
Online Patient Access to their Medical Record and Health Providers is Associa...Online Patient Access to their Medical Record and Health Providers is Associa...
Online Patient Access to their Medical Record and Health Providers is Associa...HMO Research Network
 
Documentations of Advanced Heath Care Directives Where Are They TAI_SEALE
Documentations of Advanced Heath Care Directives Where Are They TAI_SEALEDocumentations of Advanced Heath Care Directives Where Are They TAI_SEALE
Documentations of Advanced Heath Care Directives Where Are They TAI_SEALEHMO Research Network
 

More from HMO Research Network (20)

New Rules Dealing with Conflicts of Interest in Public Health Service Funded ...
New Rules Dealing with Conflicts of Interest in Public Health Service Funded ...New Rules Dealing with Conflicts of Interest in Public Health Service Funded ...
New Rules Dealing with Conflicts of Interest in Public Health Service Funded ...
 
From Populations to Patients
From Populations to PatientsFrom Populations to Patients
From Populations to Patients
 
Evaluation of the Validity of the Gestational Length Assumptions Based Upon A...
Evaluation of the Validity of the Gestational Length Assumptions Based Upon A...Evaluation of the Validity of the Gestational Length Assumptions Based Upon A...
Evaluation of the Validity of the Gestational Length Assumptions Based Upon A...
 
Comparative Safety of Infliximaband Etanercept on the Risk of Serious Infecti...
Comparative Safety of Infliximaband Etanercept on the Risk of Serious Infecti...Comparative Safety of Infliximaband Etanercept on the Risk of Serious Infecti...
Comparative Safety of Infliximaband Etanercept on the Risk of Serious Infecti...
 
A Multi State Markov Model for Analyzing Patterns of Use of Opiod Treatments ...
A Multi State Markov Model for Analyzing Patterns of Use of Opiod Treatments ...A Multi State Markov Model for Analyzing Patterns of Use of Opiod Treatments ...
A Multi State Markov Model for Analyzing Patterns of Use of Opiod Treatments ...
 
A Descriptive Study of Vaccinations Occuring During Pregnancy HENNINGER
A Descriptive Study of Vaccinations Occuring During Pregnancy HENNINGERA Descriptive Study of Vaccinations Occuring During Pregnancy HENNINGER
A Descriptive Study of Vaccinations Occuring During Pregnancy HENNINGER
 
The Use of Administrative Data and Natural Language Processing to Estimate th...
The Use of Administrative Data and Natural Language Processing to Estimate th...The Use of Administrative Data and Natural Language Processing to Estimate th...
The Use of Administrative Data and Natural Language Processing to Estimate th...
 
Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L...
Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L...Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L...
Patient Views of KRAS Testing for Treatment of Metastatic Colorectal Cancer L...
 
Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C...
Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C...Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C...
Comparative Effectiveness of Chemotherapy Regimens for Advanced Lung Cancer C...
 
CER HUB An Informatics Platform for Conducting Compartive Effectiveness with ...
CER HUB An Informatics Platform for Conducting Compartive Effectiveness with ...CER HUB An Informatics Platform for Conducting Compartive Effectiveness with ...
CER HUB An Informatics Platform for Conducting Compartive Effectiveness with ...
 
An Application of Doubly Robust Estimation JOHNSON
An Application of Doubly Robust Estimation JOHNSONAn Application of Doubly Robust Estimation JOHNSON
An Application of Doubly Robust Estimation JOHNSON
 
Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un...
Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un...Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un...
Risk Factors for Short Term Virologic Outcomes Among HIV Infected Patients Un...
 
Expanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOK
Expanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOKExpanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOK
Expanding SEER Reporting with Comorbidity Data Colorectal Cancer HORNBROOK
 
Drug Characteristics Associated with Medication Adherence Across Eight Diseas...
Drug Characteristics Associated with Medication Adherence Across Eight Diseas...Drug Characteristics Associated with Medication Adherence Across Eight Diseas...
Drug Characteristics Associated with Medication Adherence Across Eight Diseas...
 
Feasibility of Implementing Screening Brief Intervention and Referral to Trea...
Feasibility of Implementing Screening Brief Intervention and Referral to Trea...Feasibility of Implementing Screening Brief Intervention and Referral to Trea...
Feasibility of Implementing Screening Brief Intervention and Referral to Trea...
 
eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici...
eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici...eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici...
eCare for Heart Wellness A Trial to Test the Feasibility of Web Based Dietici...
 
A Telephone Based Diabetes Prevention Program and Social Support for Weight L...
A Telephone Based Diabetes Prevention Program and Social Support for Weight L...A Telephone Based Diabetes Prevention Program and Social Support for Weight L...
A Telephone Based Diabetes Prevention Program and Social Support for Weight L...
 
Technological Resources & Personnel Costs Required to Implement an Automated ...
Technological Resources & Personnel Costs Required to Implement an Automated ...Technological Resources & Personnel Costs Required to Implement an Automated ...
Technological Resources & Personnel Costs Required to Implement an Automated ...
 
Online Patient Access to their Medical Record and Health Providers is Associa...
Online Patient Access to their Medical Record and Health Providers is Associa...Online Patient Access to their Medical Record and Health Providers is Associa...
Online Patient Access to their Medical Record and Health Providers is Associa...
 
Documentations of Advanced Heath Care Directives Where Are They TAI_SEALE
Documentations of Advanced Heath Care Directives Where Are They TAI_SEALEDocumentations of Advanced Heath Care Directives Where Are They TAI_SEALE
Documentations of Advanced Heath Care Directives Where Are They TAI_SEALE
 

Validation of a Natural Language Processing Protocol for Detecting Heart Failure Signs in Electronic Health Record Notes BYRD

  • 1. Validation of a Natural Language Processing Protocol for Detecting Heart Failure Signs and Symptoms in Electronic Health Record Text Notes Roy J. Byrd2, Steven R. Steinhubl1, Jimeng Sun2, Shahram Ebadollahi2, Zahra Daar1, Walter F. Stewart1 1Geisinger Medical Center, Center for Health Research, Danville, PA 2 IBM, T.J. Watson Research Center, Hawthorne, NY
  • 2. Outline • Background and objectives • Datasets • Tools & Methods • Results • Discussion – Challenges – Opportunities • Summary • (Iterative annotation refinement)
  • 3. Background and Objectives • Background – Framingham criteria for HF published in 1971 – Geisinger/IBM “PredMED” project on predictive modeling for early detection of HF, using longitudinal EHRs • Overall Project Objective Better understand the presentation of HF in the primary care setting, in order to facilitate its more rapid identification and treatment • Objective of this paper: Build and validate NLP extractors for Framingham criteria (signs and symptoms) from EHR clinical notes, so that they may be suitable for downstream diagnostic applications
  • 4. Framingham HF Diagnostic Criteria
      MAJOR SYMPTOMS
         1. Paroxysmal Nocturnal Dyspnea (PND) or Orthopnea
         2. Neck Vein Distension (JVD)
         3. Rales
         4. Radiographic Cardiomegaly
         5. Acute Pulmonary Edema
         6. S3 Gallop
         7. Increased Central Venous Pressure (> 16 cm H2O at RA)
         8. Circulation Time of 25 seconds**
         9. Hepatojugular Reflux (HJR)
         10. Weight loss 4.5kg in 5 days in response to treatment
      MINOR SYMPTOMS
         1. Bilateral Ankle Edema
         2. Nocturnal Cough
         3. Dyspnea on ordinary exertion
         4. Hepatomegaly
         5. Pleural effusion
         6. A decrease in vital capacity by 1/3 of the maximal value recorded**
         7. Tachycardia (>120 BPM)
      ** Not extracted, since these criteria are not documented in routine clinical practice.
      Source: N Engl J Med. 1971;285:1441-1446.
  • 5. (Sample downstream analysis) Reports of Framingham HF criteria in the year prior to diagnosis
      [Bar chart. Y-axis: Percent with Documented Criteria. Series: Cases (N=4,644) and Controls (N=45,981). Categories: PND, Rales, JVD, Pulm Edema, CMegaly, Ankle Edema, DOE. Data labels as extracted, not matched to individual bars: 62.3, 65, 28.6, 22.9, 17.2, 17.9, 17.7, 7.2, 5.8, 5.2, 1.7, 1.4, 0.7, 1.1.]
  • 6. Datasets
      • Clinical notes from longitudinal (2001-2010) EHR encounters for
         – 6,355 case patients
            • Meet operational criteria for HF**
         – 26,052 control patients
            • Clinic-, gender- and age-matched to cases
         – The case-control distinction is exploited in downstream applications; it is not relevant for criteria extraction.
      • Development dataset
         – 65 encounter notes
            • Selected for density of Framingham criteria
            • Annotated by a clinical expert
      • Validation dataset
         – 400 encounter notes (200 cases & 200 controls)
            • Randomly selected
            • Annotated by consensus of 4 trained coders
            • N = 1492 criteria
      **Operational HF Criteria
         – HF diagnosis on problem list,
         – HF diagnosis in EHR for two outpatient encounters,
         – Two or more medications with ICD-9 code for HF, or
         – One HF diagnosis and one medication with ICD-9 code for HF
  • 7. Tools
      • LRW1 – LanguageWare Resource Workbench
         – Basic Text Processing
         – Dictionaries
         – Grammars
      • UIMA2 - Unstructured Information Management Architecture
         – Execution Pipeline, including I/O management
         – Text Analysis Engines
      • TextSTAT3 – Simple Text Analysis Tool
         – Concordance program, used for linguistic analysis
      [Diagram: UIMA Collection Processing Engine. Encounter Documents → Basic Text Processing (paragraphs, sentences, tokenization, etc.) → Dictionaries and Grammars for recognizing criteria candidates → Text Analysis Engines for applying constraints and annotating criteria → Extracted Criteria.]
      1http://www/alphaworks.ibm.com/tech/lrw
      2http://uima.apache.org
      3http://neon.niederlandistik.fu-berlin.de/en/textstat
  • 8. Criteria Extraction Methods: Dictionaries
      • Framingham Criteria vocabulary
         – Words and phrases used to mention the 15 Framingham Criteria
         – edema, leg edema, oedema; shortness of breath, SOB
         – Size: ~75 “lemma forms” (main entries) and hundreds of variant forms
      • Segment Header words
         – Patient History, Examination, Plan, Instruction
      • Negating words
         – Used to deny criteria
            • no, free of, ruled out
      • Counterfactual triggers
         – The criteria may not have occurred
            • if, should, as needed for
      • Miscellaneous Classes
         – Weight loss phrases
            • lose weight, diurese
         – Time value words and phrases
            • day, week, month
         – Weight units
            • pound, kilogram
         – Diuretics
            • Bumex, Furosemide
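The dictionary look-up described on this slide can be sketched roughly as follows. This is a minimal illustration in Python, not the LanguageWare dictionaries the authors used: the `LEMMAS` table, the grouping of variants under each lemma, and the `find_candidates` helper are all assumptions made for the example, seeded only with the sample words listed on the slide.

```python
import re

# Hypothetical mini-dictionary: lemma form -> variant forms (from the slide's examples).
LEMMAS = {
    "edema": ["edema", "leg edema", "oedema"],
    "shortness of breath": ["shortness of breath", "sob"],
}

# Reverse index from each lower-cased variant to its lemma.
VARIANT_TO_LEMMA = {v: lemma for lemma, variants in LEMMAS.items() for v in variants}

# Longest variants first, so "leg edema" is preferred over the shorter "edema".
_variants = sorted(VARIANT_TO_LEMMA, key=len, reverse=True)
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(v) for v in _variants) + r")\b",
    re.IGNORECASE,
)


def find_candidates(text):
    """Return (offset, matched text, lemma) for every dictionary hit in text."""
    return [
        (m.start(), m.group(0), VARIANT_TO_LEMMA[m.group(0).lower()])
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    for hit in find_candidates("Pt reports SOB and leg edema."):
        print(hit)
```

Mapping variants to a lemma keeps the downstream rules small: they only need to reason about the ~75 main entries rather than the hundreds of surface forms.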
  • 9. Criteria Extraction Methods: Grammars
      • Shallow English syntax
         – Noun Phrases
            • some moderate DOE
         – Compound Noun Phrases
            • chest pain, DOE, or night cough
         – Prepositional Phrases
      • No full-sentential parses
         – Not needed for simple HF criteria
         – Unreliable sentence boundaries and syntax in clinical notes
      • Negated Scope
         – regular rate and rhythm without murmurs, clicks, gallops, or rubs
      • Counterfactual Scope
         – Patient should call if she experiences shortness of breath
      • Weight Loss
         – 20 pound weight loss in a week with diuretics
      • Tachycardia
         – tachy at 120 (to 130)
         – HR: 135
  • 10. Criteria Extraction Methods: Text Analysis Engines (TAEs)
      • Rules to filter candidate criteria created from dictionaries and grammars.
      • Deny criteria mentioned in negated contexts
         – regular rate and rhythm without murmurs, clicks, gallops, or rubs → S3Neg
      • Ignore criteria in counterfactual contexts
         – Patient should call if she experiences shortness of breath
      • Co-occurrence constraints
         – exercise HR: 135 doesn’t affirm Tachycardia
      • Disambiguation
         – edema is recognized as APEdema, if near cxr, or in a “Radiology” note, or in a “Chest X-Ray” segment
      • Numeric constraints
         – she lost 5 pounds over a month doesn’t affirm WeightLoss
         – tachy @ 115 doesn’t affirm Tachycardia
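The filtering rules on this slide might look something like the sketch below. It is an illustration only, not the authors' UIMA Text Analysis Engines: the function names are hypothetical, "without" is added to the negating words because it appears in the slide's example, scope is simplified to "any trigger to the left of the mention in the same sentence", and the tachycardia pattern covers only the sample phrasings shown in the slides.

```python
import re

# Trigger words from the dictionaries slide; "without" is an added assumption.
NEGATORS = ("no", "without", "free of", "ruled out")
COUNTERFACTUALS = ("if", "should", "as needed for")


def _has_trigger(text, triggers):
    """True if any trigger occurs in text as a whole word or phrase."""
    return any(
        re.search(r"\b" + re.escape(t) + r"\b", text, re.IGNORECASE)
        for t in triggers
    )


def classify_mention(sentence, mention):
    """Return 'affirmed', 'denied', or 'ignored' for a mention in a sentence.

    Simplification: only the text to the left of the mention is inspected.
    Returns None when the mention does not occur in the sentence.
    """
    idx = sentence.lower().find(mention.lower())
    if idx < 0:
        return None
    left_context = sentence[:idx]
    if _has_trigger(left_context, COUNTERFACTUALS):
        return "ignored"
    if _has_trigger(left_context, NEGATORS):
        return "denied"
    return "affirmed"


def affirms_tachycardia(text, threshold=120):
    """Apply the co-occurrence and numeric constraints for Tachycardia (>120 BPM)."""
    if re.search(r"\bexercise\b", text, re.IGNORECASE):
        return False  # co-occurrence constraint: exercise heart rates do not count
    m = re.search(r"\b(?:tachy\w*|HR)\b\s*(?:at|@|:)?\s*(\d{2,3})", text, re.IGNORECASE)
    return bool(m) and int(m.group(1)) > threshold


if __name__ == "__main__":
    print(classify_mention(
        "regular rate and rhythm without murmurs, clicks, gallops, or rubs", "gallops"))
    print(classify_mention(
        "Patient should call if she experiences shortness of breath", "shortness of breath"))
    print(affirms_tachycardia("HR: 135"), affirms_tachycardia("tachy @ 115"))
```

Checking counterfactual triggers before negators is a design choice of this sketch: a criterion that may not have occurred is dropped rather than recorded as a denial.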
  • 11. Encounter Labeling Methods • We can label an encounter note with labels showing the criteria that the note mentions – The labels can be used by downstream analyses to gather information such as: “This patient exhibited those symptoms on that date.” • 2 Methods: – Machine-learning • Using candidate criteria and scope annotations, as features, … • use a [CHAID decision tree] classifier to assign criteria as labels. – Rule-based • Run the full extractor pipeline, then … • Assign labels consisting of all unique criteria that survive filtering.
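The rule-based labeling method can be sketched in a few lines. This is an assumption-laden illustration: the `(criterion, status)` input format and the `label_encounter` name are hypothetical, and the `Neg` suffix simply mirrors label names such as `S3Neg` and `DOENeg` that appear elsewhere in the slides.

```python
def label_encounter(candidates):
    """Collapse filtered criteria mentions into a sorted list of unique labels.

    candidates: iterable of (criterion, status) pairs, where status is
    'affirmed', 'denied', or 'ignored' (e.g. a counterfactual mention).
    """
    labels = set()
    for criterion, status in candidates:
        if status == "affirmed":
            labels.add(criterion)
        elif status == "denied":
            labels.add(criterion + "Neg")
        # Ignored mentions did not survive filtering and produce no label.
    return sorted(labels)


if __name__ == "__main__":
    mentions = [
        ("DOE", "affirmed"),
        ("DOE", "affirmed"),   # repeated mentions yield a single label
        ("S3G", "denied"),
        ("PND", "ignored"),
    ]
    print(label_encounter(mentions))
```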
  • 13. Evaluation
      Metrics:
         Precision (Positive Predictive Value): #TruePositive / (#TruePositive + #FalsePositive)
         Recall (Sensitivity): #TruePositive / (#TruePositive + #FalseNegative)
         F-Score (the harmonic mean of Precision and Recall): (2 x Precision x Recall) / (Precision + Recall)
      [Flow diagram. Boxes as extracted: Encounter Documents, Lexical Look-up, Lexical & Scope Annotations, Machine Learning, Rules, Encounter Labels, Encounter Label Evaluation, Criteria. The exact connections are not recoverable from the transcript.]
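The three metrics defined on this slide translate directly into code. The sketch below is a plain restatement of those formulas; the function name and the example counts are made up for illustration and are not results from the study.

```python
def precision_recall_f(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F-score from raw counts.

    Returns 0.0 for any metric whose denominator would be zero.
    """
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = (2 * precision * recall) / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    p, r, f = precision_recall_f(true_pos=90, false_pos=10, false_neg=30)
    print(f"precision={p:.3f} recall={r:.3f} f_score={f:.3f}")
```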
  • 14. Encounter Labeling Performance
                                 Machine-learning method             Rule-based method
                                 Recall    Precision  F-Score        Recall    Precision  F-Score
      Affirmed                   0.675000  0.754190   0.712401       0.738532  0.899441   0.811083
      Denied                     0.945556  0.905319   0.925000       0.987599  0.931915   0.958949
      Overall                    0.896364  0.881144   0.888689       0.938462  0.926720   0.932554
      Overall 99% Conf. Int.     (0.848-0.929)                       (0.900-0.964)
      Conclusion: Machine-learning labeling does not significantly underperform rule-based labeling.
  • 15. Performance of Framingham Diagnostic Criteria Extraction
                           Precision  Recall    F-score   99% Confidence Interval (F-score)
      Overall (exact)      0.925234   0.896864  0.910828  (0.891 - 0.929)
      Overall (relaxed)    0.948239   0.919164  0.933475  (0.916 - 0.950)
      Affirmed             0.747801   0.789474  0.768072  (0.711 - 0.824)
      Denied               0.982857   0.928058  0.954672  (0.938 - 0.970)
      Note: Performance on affirmed criteria is worse, possibly because of their greater syntactic diversity. For example, we don’t find:
         PleuralEffusion: blunting of the right costrophrenic angle
         DOExertion: she felt like she couldn’t get enough air in
  • 16. Precision and Recall for Individual Criteria
  • 17. Analysis of 1492 extracted criteria: PredMED extractions vs. Gold Standard annotations
      [Confusion matrix. Rows are PredMED extractions; columns are Gold Standard annotations, apparently with a final False Positive column. The column headers were rotated text and did not survive extraction, so the counts below are listed per row in extraction order, without column positions.]
      ANKED: 90, 6, 16
      ANKEDNeg: 230, 6
      APED: 8, 5, 2, 1, 22
      APEDNeg: 0
      DOE: 116, 17, 1, 3
      DOENeg: 3, 135, 2, 1
      HEP: 0, 1
      HEPNeg: 125
      HJR: 2, 1
      HJRNeg: 9
      JVD: 7, 2
      JVDNeg: 91
      NC: 2
      NCNeg: 43, 2
      PLE: 8
      PLENeg: 1
      PND: 1, 7, 2
      PNDNeg: 69
      RALE: 11, 1
      RALENeg: 197
      RC: 6
      RCNeg: 1
      S3G: 0
      S3GNeg: 131
      TACH: 1, 2
      TACHNeg: 0, 4
      WTL: 0
      False Negative: 6, 8, 5, 2, 6, 5, 1, 4, 1, 3, 2, 2, 7, 35, 2, 1, 1, 10
  • 18. Discussion
      • Challenges
         – Data quality: EHR text data is messy.
            • >10% (i.e., 26/237) of the errors are caused by misspellings & bad sentence boundaries
         – Human anatomy
            • We need a better solution than word co-occurrence constraints
         – Syntactic diversity of affirmed criteria
            • We need deeper syntactic and semantic analysis
         – Contradictions and redundancy
            • An issue for downstream analysis
      • Opportunities
         – We can apply similar techniques to other collections of criteria.
            • NY Heart Association
            • European Society of Cardiology
            • MedicalCriteria.com
         – Many specific criteria extractors can be re-used in other settings.
         – For downstream applications, see posters and presentations from our project at this conference
  • 20. Summary
      • Extractors can identify affirmations and denials of Framingham HF criteria in EHR clinical notes with an overall F-Score of 0.91.
      • Classifiers can label EHR encounters with the Framingham criteria they mention with an F-Score of 0.93.
      • Information about HF criteria mentioned in EHR notes appears to be useful for downstream applications that seek to achieve early detection of HF.
  • 22. Iterative Annotation Refinement
      • What are the problems solved?
         – Annotations are required for training and evaluating criteria extractors.
         – Human annotators without guidelines have high precision but lower recall.
         – Domain experts’ intuitions (about the language for expressing criteria) are initially imprecise.
      • What is produced?
         – Annotated dataset
         – Annotation guidelines
         – Criteria extractors
         … that are consistent
  • 23. The Development Process: Iterative Annotation Refinement
      [Process diagram, reconstructed from the extracted box labels.]
      • Input: Encounter Texts
      • Initialization
         – Expert: Write initial guidelines
         – Linguist: Build initial extractors
      • Iteration
         – Annotate texts with current extractors
         – Perform error analysis
         – Discuss the language of HF criteria
         – Expert: Update the annotations and the guidelines
         – Linguist: Update the extractors
      • Results
         – Annotations
         – Annotation Guidelines
         – Criteria Extractors
  • 24. User interface for the annotation tool, which was used to manage annotations during refinement.
  • 25. Performance improvement during development
      [Scatter plot: Performance comparison. X-axis: Recall (0.5 to 1). Y-axis: Precision (0.5 to 1). Series: PredMED and Clinical Expert, each with an Initial and a Final point.]
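  • 26. Iterative methods for creating annotations, guidelines, and extractors
      • Iterative Annotation Refinement
         – Extraction target: Framingham HF criteria
         – Result of using the method: Annotations, Guidelines, Extractor
         – Sources of annotations compared in each iteration: Expert and Extractor
         – Arbiter for disagreements at each iteration: Expert
         – Objective (and metric) for each iteration: Improve extractor performance (F-score)
      • Annotation Induction (Chapman, et al. J Biom Inf 2006)
         – Extraction target: Clinical conditions
         – Result of using the method: Guidelines (in the form of an annotation schema)
         – Sources of annotations compared in each iteration: Expert and Linguist
         – Arbiter for disagreements at each iteration: Consensus
         – Objective (and metric) for each iteration: Improve inter-annotator agreement (F-score)
      • CDKRM (Coden, et al., J Biom Inf 2009)
         – Extraction target: Classes in the cancer disease model
         – Result of using the method: Annotations, Guidelines
         – Sources of annotations compared in each iteration: 2 Experts
         – Arbiter for disagreements at each iteration: Consensus
         – Objective (and metric) for each iteration: Improve inter-annotator agreement (agreement %)
      • TALLAL (Carrell, et al, GHRI-IT poster, 2010)
         – Extraction target: PHI (protected health information) classes
         – Result of using the method: Annotations, Extractor
         – Sources of annotations compared in each iteration: Expert and Extractor
         – Arbiter for disagreements at each iteration: Expert
         – Objective (and metric) for each iteration: Annotate full dataset (to the expert’s satisfaction)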