Comparison of Association Rule Mining and Crowdsourcing for Automated Generation of a Problem-Medication Knowledge Base

Allison B. McCoy, PhD¹, Adam Wright, PhD², Dean F. Sittig, PhD¹
¹The University of Texas Health Science Center at Houston, School of Biomedical Informatics
²Brigham and Women’s Hospital, Harvard Medical School




Objective

To compare the use of association rule mining and crowdsourcing to generate a problem-medication knowledge base using a single source of clinical data from a commercially available electronic health record (EHR).

Introduction

Increased amounts of EHR data have led to inefficiencies for clinicians trying to locate relevant patient information. Automated summarization tools that create condition-specific data displays, rather than the current displays by data type, may improve clinician efficiency. However, these tools require new clinical knowledge (e.g., problem-medication relationships) that is difficult to obtain. Approaches to automatically generating this knowledge include:

•   Standards-based ontologies, such as NDF-RT, a reference terminology for medications that provides a formal content model to describe medications and definitional relationships. However, mapping of EHR data to standard terminologies can be problematic.
•   Association rule mining, a method of data mining that identifies related concepts using measures of interestingness and has been previously used to identify relationships between clinical data elements.
•   Crowdsourcing, defined as outsourcing a task to a group of people, which takes advantage of laboratory tests manually linked to clinical problems by clinicians during standard EHR e-ordering, a task required by many institutions for billing (Fig. 1).

Figure 1. Sample screen for linking a medication to an indicated problem during e-prescribing.

Study Setting and Data

We collected data from a large, multi-specialty, academic practice that provides ambulatory care throughout Houston, TX. Clinicians utilized Allscripts Enterprise EHR to maintain patient notes and problem lists, order laboratory tests, and prescribe medications. During the one-year study period, clinicians entered 418,221 medications and 1,222,308 problems for 53,108 patients.

Knowledge Base Generation

For association rule mining, we used a minimum support threshold of 5 and a minimum confidence threshold of 10% to identify related problem-medication pairs. We ranked pairs using the chi-squared statistic, which performed best when compared to a gold standard in our previous analysis.

For crowdsourcing, we retrieved links between medications and problems asserted by clinicians during e-prescribing. We performed a logistic regression using a subset of pairs that were manually reviewed for appropriateness. We ranked each pair i using the resulting predictor function, where p_i represents the number of patients having pair i and r_i represents the ratio of patients having the pair linked to the number of patients having both the medication and problem in pair i co-occurring.

    f(i) = 0.14 * p_i + 2.34 * r_i + 2.45

Comparison of Approaches

We computed Spearman’s rho to test the correlation between the association rule mining and crowdsourcing rankings. For the 500 top-ranked problem-medication pairs for each approach, we then determined the number of pairs that existed in both sets and the number of pairs that were unique. We manually inspected the top-ranked pairs to classify the types best identified by each approach.

Results

Association rule mining identified 19,586 pairs, including 2,920 distinct medications and 4,759 distinct problems. Crowdsourcing identified 31,440 pairs, including 2,756 distinct medications and 4,675 distinct problems. Spearman’s rho comparing overlapping pairs was 0.539, with p < 0.0001 (Fig. 2). Of the top 500 ranked pairs for both approaches, 186 overlapped (Fig. 3).

Of the top-ranked association rule mining pairs, nine were also identified through crowdsourcing. Support and confidence varied, as did the number of patients having the pair linked and the ratio of linked pairs for those also identified through crowdsourcing. All top-ranked crowdsourcing pairs had a corresponding association rule mining rank. The number of patients having the pair linked was greater than 500 for all pairs, while the ratio of linked pairs had a wide range. Support for the association rule mining was high and confidence varied.

Top-ranked pairs uniquely identified through association rule mining included some rarely prescribed medications (e.g., glycopyrrolate) and non-clinical problems (e.g., taking medication). Top-ranked pairs uniquely identified through crowdsourcing included commonly prescribed medications with secondary indicated problems (e.g., metformin and polycystic ovarian syndrome).

Figure 2. Overlapping problem-medication pairs in association rule mining and crowdsourcing. [Plot: x-axis, Crowdsourcing (Logistic Regression Fitted Value), 0 to 200; y-axis, Association Rule Mining (Chi-Squared), 0 to 60000.]

Figure 3. Overlap between association rule mining and crowdsourcing approaches. [Diagram values: 314, 184, 314.]

Top-Ranked Association Rule Mining Problem-Medication Pairs

Rank  Medication              Problem                                  Support  Confidence  Crowdsourcing Rank  Number of Patients  Ratio of Linked Pairs
1     Permethrin              Scabies                                  125      0.874       108                 100                 0.8
2     MetroNIDAZOLE           Bacterial Vaginosis                      1061     0.563       4                   1003                0.945
3     Rilutek                 Motor Neuron Disease                     5        0.833       13543               3                   0.6
4     Terconazole             Vaginal Candidiasis                      404      0.599       20                  388                 0.960
5     Amikacin Sulfate        Pseudomonas Wound Infection              5        1           N/A                 N/A                 N/A
6     Glucagon Emergency      Type I Diabetes Mellitus - Uncontrolled  115      0.762       113                 96                  0.835
7     Levothyroxine Sodium    Hypothyroidism                           1865     0.675       1                   1396                0.749
8     LevOCARNitine           Disorder Of Mitochondrial Metabolism     65       0.3476      240                 53                  0.815
9     Griseofulvin Microsize  Tinea Capitis                            99       0.846       171                 74                  0.747
10    Solu-CORTEF             Congenital Adrenal Hyperplasia           16       0.552       663                 14                  0.875

Top-Ranked Crowdsourcing Problem-Medication Pairs

Rank  Medication              Problem              Number of Patients  Ratio of Linked Pairs  Association Rule Mining Rank  Support  Confidence
1     Levothyroxine Sodium    Hypothyroidism       1396                0.749                  7                             1865     0.675
2     Simvastatin             Hyperlipidemia       1152                0.651                  102                           1769     0.609
3     Lisinopril              Hypertension         1045                0.402                  315                           2598     0.590
4     MetroNIDAZOLE           Bacterial Vaginosis  1003                0.945                  2                             1061     0.563
5     Lipitor                 Hyperlipidemia       865                 0.563                  143                           1537     0.591
6     Hydrochlorothiazide     Hypertension         731                 0.429                  468                           1703     0.625
7     AmLODIPine Besylate     Hypertension         699                 0.422                  460                           1658     0.635
8     Fluticasone Propionate  Allergic Rhinitis    630                 0.699                  150                           901      0.525
9     NexIUM                  Esophageal Reflux    561                 0.613                  211                           915      0.352
10    MetFORMIN HCl           Diabetes Mellitus    566                 0.353                  437                           1605     0.544

Discussion

Both approaches effectively identified related pairs; crowdsourcing likely identified more because we did not restrict inclusion, while for association rule mining we set support and confidence thresholds. Review of overlaps between approaches found a heavy positive skew when comparing the number of pairs included with the percentage of overlap, suggesting that the percentage of overlap increases as the number of pairs included increases until a certain threshold, at which point both approaches become less accurate.

Some limitations of this work include the use of a single source of data that may not be directly utilized by other EHR systems; the use of structured elements, which may be incomplete compared with narrative text; and the lack of an evaluation of the appropriateness of the identified pairs.

Summary of Conclusions

Association rule mining and crowdsourcing are effective, complementary approaches for automatically generating a problem-medication knowledge base, which can be used to improve clinical care through summary screens. Further research is necessary to combine and better evaluate the approaches to generate an all-inclusive, highly accurate problem-medication knowledge base.

Acknowledgments

This project was supported by Grant No. 10510592 for Patient-Centered Cognitive Support under the Strategic Health IT Advanced Research Projects (SHARP) from the Office of the National Coordinator for Health Information Technology and NCRR grant 3UL1RR024148.

Please contact the first author via email: allison.b.mccoy@uth.tmc.edu
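
The association rule mining step described under Knowledge Base Generation relies on three measures: support, confidence, and the chi-squared statistic. The poster does not define them, so the following is a minimal sketch under stated assumptions: support is taken as the count of patients having both the medication and the problem, confidence as that count divided by the number of patients having the medication (the rule direction is an assumption), and chi-squared as the uncorrected Pearson statistic on a 2x2 table. The `PairCounts` structure and all function names are hypothetical, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairCounts:
    """Patient counts for one medication-problem pair (hypothetical structure)."""
    both: int             # patients with the medication and the problem
    medication_only: int  # patients with the medication but not the problem
    problem_only: int     # patients with the problem but not the medication
    neither: int          # patients with neither


def support(c: PairCounts) -> int:
    # Assumption: support is an absolute co-occurrence count.
    return c.both


def confidence(c: PairCounts) -> float:
    # Assumption: rule direction is medication -> problem.
    with_medication = c.both + c.medication_only
    return c.both / with_medication if with_medication else 0.0


def chi_squared(c: PairCounts) -> float:
    # Pearson chi-squared for a 2x2 table, without continuity correction.
    a, b, cc, d = c.both, c.medication_only, c.problem_only, c.neither
    n = a + b + cc + d
    denominator = (a + b) * (cc + d) * (a + cc) * (b + d)
    if denominator == 0:
        return 0.0  # a zero margin means the statistic is undefined; treat as no association
    return n * (a * d - b * cc) ** 2 / denominator


def passes_thresholds(c: PairCounts, min_support: int = 5,
                      min_confidence: float = 0.10) -> bool:
    # Thresholds from the poster: support of 5 and confidence of 10%.
    return support(c) >= min_support and confidence(c) >= min_confidence
```

Pairs that pass the thresholds would then be sorted by `chi_squared` in descending order to obtain a ranking; the counts fed to these functions would have to come from the EHR data, which the poster does not reproduce.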
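
The crowdsourcing predictor function and the comparison described under Comparison of Approaches can be sketched as follows. This is an illustration only: the coefficients are the rounded values printed on the poster, the Spearman computation is a plain standard-library version (Pearson correlation of average ranks) that does not produce the p-value the poster reports, and the function names and the dictionary layout mapping each (medication, problem) pair to a score are assumptions.

```python
from typing import Dict, List, Sequence, Set, Tuple

Pair = Tuple[str, str]  # (medication, problem); hypothetical key layout


def crowdsourcing_score(patients_linked: int, linked_ratio: float) -> float:
    # f(i) = 0.14 * p_i + 2.34 * r_i + 2.45, with coefficients as printed.
    return 0.14 * patients_linked + 2.34 * linked_ratio + 2.45


def average_ranks(values: Sequence[float]) -> List[float]:
    # 1-based ranks in ascending order; tied values share their average rank.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs: Sequence[float], ys: Sequence[float]) -> float:
    # Spearman's rho computed as the Pearson correlation of the ranks.
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one ranking is constant")
    return cov / (var_x * var_y) ** 0.5


def top_k(scores: Dict[Pair, float], k: int) -> Set[Pair]:
    # The k highest-scoring pairs.
    return set(sorted(scores, key=lambda p: scores[p], reverse=True)[:k])


def compare(arm_scores: Dict[Pair, float], crowd_scores: Dict[Pair, float],
            k: int = 500) -> Tuple[float, int, int, int]:
    # Rho over pairs found by both approaches, then top-k overlap and unique counts.
    shared = sorted(set(arm_scores) & set(crowd_scores))
    rho = spearman_rho([arm_scores[p] for p in shared],
                       [crowd_scores[p] for p in shared])
    top_arm, top_crowd = top_k(arm_scores, k), top_k(crowd_scores, k)
    return (rho, len(top_arm & top_crowd),
            len(top_arm - top_crowd), len(top_crowd - top_arm))
```

With `arm_scores` holding chi-squared values and `crowd_scores` holding `crowdsourcing_score` outputs, `compare(arm_scores, crowd_scores, k=500)` returns rho together with the number of pairs in both top-500 sets and the number unique to each. Because the printed coefficients are rounded, scores recomputed from the tables may not reproduce the published ordering exactly for closely ranked pairs.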

A Year of the Servo Reboot: Where Are We Now?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Comparison of Association Rule Mining and Crowdsourcing for Automated Generation of a Problem-Medication Knowledge Base

Allison B. McCoy, PhD (1), Adam Wright, PhD (2), Dean F. Sittig, PhD (1)
(1) The University of Texas Health Science Center at Houston – School of Biomedical Informatics
(2) Brigham and Women's Hospital, Harvard Medical School

Objective

To compare the use of association rule mining and crowdsourcing to generate a problem-medication knowledge base using a single source of clinical data from a commercially available electronic health record (EHR).

Introduction

Increased amounts of EHR data have led to inefficiencies for clinicians trying to locate relevant patient information. Automated summarization tools that create condition-specific data displays, rather than current displays by data type, may improve clinician efficiency. However, these tools require new clinical knowledge (e.g., problem-medication relationships) that is difficult to obtain. Approaches to automatically generating this knowledge include:

  • Standards-based ontologies, such as NDF-RT, a reference terminology for medications that provides a formal content model to describe medications and definitional relationships. However, mapping of EHR data to standard terminologies can be problematic.
  • Association rule mining, a method of data mining that identifies related concepts using measures of interestingness and has been previously used to identify relationships between clinical data elements.
  • Crowdsourcing, defined as outsourcing a task to a group of people, which takes advantage of manually linked laboratory tests to clinical problems by clinicians during standard EHR e-ordering, a task required by many institutions for billing (Fig. 1).

Figure 1. Sample screen for linking a medication to an indicated problem during e-prescribing.

Study Setting and Data

We collected data from a large, multi-specialty, academic practice that provides ambulatory care throughout Houston, TX. Clinicians utilized Allscripts Enterprise EHR to maintain patient notes and problem lists, order laboratory tests, and prescribe medications. During the one-year study period, clinicians entered 418,221 medications and 1,222,308 problems for 53,108 patients.

Knowledge Base Generation

For association rule mining, we used minimum support and confidence thresholds of 5 and 10%, respectively, to identify related problem-medication pairs. We ranked pairs using the chi-squared statistic, which performed best when compared to a gold standard in our previous analysis.

For crowdsourcing, we retrieved links between medications and problems asserted by clinicians during e-prescribing. We performed a logistic regression using a subset of pairs that were manually reviewed for appropriateness. We ranked each pair i using the resulting predictor function, where p_i represents the number of patients having pair i and r_i represents the ratio of patients having the pair linked to the number of patients having both the medication and problem in pair i co-occurring.

    f(i) = 0.14 * p_i + 2.34 * r_i + 2.45

Comparison of Approaches

We computed Spearman's rho to test the correlation between association rule mining and crowdsourcing. For the 500 top-ranked problem-medication pairs for each approach, we then determined the number of pairs that existed in both sets and the number of pairs that were unique. We manually inspected the top-ranked pairs to classify the types best identified by each approach.

Results

Association rule mining identified 19,586 pairs, including 2,920 distinct medications and 4,759 distinct problems. Crowdsourcing identified 31,440 pairs, including 2,756 distinct medications and 4,675 distinct problems. Spearman's rho comparing overlapping pairs was 0.539, with p < 0.0001 (Fig. 2). Of the top 500 ranked pairs for both approaches, 186 overlapped (Fig. 3).

Of the top-ranked association rule mining pairs, nine were also identified through crowdsourcing. Support and confidence varied, as did the number of patients having the pair linked and the ratio of linked pairs for those also identified through crowdsourcing. All top-ranked crowdsourcing pairs had a corresponding association rule mining rank. The number of patients having the pair linked was greater than 500 patients for all pairs, while the ratio of linked pairs had a wide range. Support for the association rule mining was high and confidence varied.

Top-ranked pairs uniquely identified through association rule mining included some rarely prescribed medications (e.g., glycopyrrolate) and non-clinical problems (e.g., taking medication). Top-ranked pairs uniquely identified through crowdsourcing included commonly prescribed medications with secondary indicated problems (e.g., metformin and polycystic ovarian syndrome).

Figure 2. Overlapping problem-medication pairs in association rule mining and crowdsourcing. Horizontal axis: Crowdsourcing (Logistic Regression Fitted Value), 0 to 200. Vertical axis: Association Rule Mining (Chi-Squared), 0 to 60000.

Figure 3. Overlap between association rule mining and crowdsourcing approaches. Values shown in the diagram: 314, 184, 314.

Top-Ranked Association Rule Mining Problem-Medication Pairs

Rank | Medication             | Problem                                 | Support | Confidence | Crowdsourcing Rank | Number of Patients | Ratio of Linked Pairs
1    | Permethrin             | Scabies                                 | 125     | 0.874      | 108                | 100                | 0.8
2    | MetroNIDAZOLE          | Bacterial Vaginosis                     | 1061    | 0.563      | 4                  | 1003               | 0.945
3    | Rilutek                | Motor Neuron Disease                    | 5       | 0.833      | 13543              | 3                  | 0.6
4    | Terconazole            | Vaginal Candidiasis                     | 404     | 0.599      | 20                 | 388                | 0.960
5    | Amikacin Sulfate       | Pseudomonas Wound Infection             | 5       | 1          | N/A                | N/A                | N/A
6    | Glucagon Emergency     | Type I Diabetes Mellitus - Uncontrolled | 115     | 0.762      | 113                | 96                 | 0.835
7    | Levothyroxine Sodium   | Hypothyroidism                          | 1865    | 0.675      | 1                  | 1396               | 0.749
8    | LevOCARNitine          | Disorder Of Mitochondrial Metabolism    | 65      | 0.3476     | 240                | 53                 | 0.815
9    | Griseofulvin Microsize | Tinea Capitis                           | 99      | 0.846      | 171                | 74                 | 0.747
10   | Solu-CORTEF            | Congenital Adrenal Hyperplasia          | 16      | 0.552      | 663                | 14                 | 0.875

Top-Ranked Crowdsourcing Problem-Medication Pairs

Rank | Medication             | Problem             | Number of Patients | Ratio of Linked Pairs | Association Rule Mining Rank | Support | Confidence
1    | Levothyroxine Sodium   | Hypothyroidism      | 1396               | 0.749                 | 7                            | 1865    | 0.675
2    | Simvastatin            | Hyperlipidemia      | 1152               | 0.651                 | 102                          | 1769    | 0.609
3    | Lisinopril             | Hypertension        | 1045               | 0.402                 | 315                          | 2598    | 0.590
4    | MetroNIDAZOLE          | Bacterial Vaginosis | 1003               | 0.945                 | 2                            | 1061    | 0.563
5    | Lipitor                | Hyperlipidemia      | 865                | 0.563                 | 143                          | 1537    | 0.591
6    | Hydrochlorothiazide    | Hypertension        | 731                | 0.429                 | 468                          | 1703    | 0.625
7    | AmLODIPine Besylate    | Hypertension        | 699                | 0.422                 | 460                          | 1658    | 0.635
8    | Fluticasone Propionate | Allergic Rhinitis   | 630                | 0.699                 | 150                          | 901     | 0.525
9    | NexIUM                 | Esophageal Reflux   | 561                | 0.613                 | 211                          | 915     | 0.352
10   | MetFORMIN HCl          | Diabetes Mellitus   | 566                | 0.353                 | 437                          | 1605    | 0.544

Discussion

Both approaches effectively identified related pairs; crowdsourcing likely identified more because we did not restrict inclusion, while for association rule mining we set support and confidence thresholds. Review of overlaps between approaches found a heavy positive skew when comparing number of pairs included with the percentage of overlap, suggesting that the percentage of overlap increases as the number of pairs included increases until a certain threshold, at which point both approaches become less accurate.

Some limitations of this work include the use of a single source of data that may not be directly utilized by other EHR systems; the use of structured elements, which may be incomplete compared with narrative text; and the lack of an evaluation of the appropriateness of the identified pairs.

Summary of Conclusions

Association rule mining and crowdsourcing are effective, complementary approaches for automatically generating a problem-medication knowledge base, which can be used to improve clinical care through summary screens. Further research is necessary to combine and better evaluate the approaches to generate an all-inclusive, highly accurate problem-medication knowledge base.

Acknowledgments

This project was supported by Grant No. 10510592 for Patient-Centered Cognitive Support under the Strategic Health IT Advanced Research Projects (SHARP) from the Office of the National Coordinator for Health Information Technology and NCRR grant 3UL1RR024148.

Please contact the first author via email: allison.b.mccoy@uth.tmc.edu
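
Illustrative sketch

The scoring and comparison steps described under Knowledge Base Generation and Comparison of Approaches might be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (support_confidence, crowd_score, top_k_overlap) and the record layout are hypothetical. The poster does not define how confidence is computed, so the sketch assumes support is the number of patients with both the medication and the problem, and confidence is that count divided by the number of patients with the medication. The coefficients in crowd_score are the rounded values printed on the poster, so scores computed this way may not reproduce the published ranks exactly.

```python
from collections import defaultdict


def support_confidence(records):
    """Count co-occurrence per (medication, problem) pair.

    records: iterable of (patient_id, medications, problems), where
    medications and problems are sets of names (hypothetical layout).
    Assumption: support = patients having both items; confidence =
    support / patients having the medication.
    """
    med_patients = defaultdict(set)
    pair_patients = defaultdict(set)
    for patient_id, meds, problems in records:
        for med in meds:
            med_patients[med].add(patient_id)
            for prob in problems:
                pair_patients[(med, prob)].add(patient_id)
    return {
        pair: (len(pats), len(pats) / len(med_patients[pair[0]]))
        for pair, pats in pair_patients.items()
    }


def crowd_score(p_i, r_i):
    """Predictor function from the poster: f(i) = 0.14*p_i + 2.34*r_i + 2.45."""
    return 0.14 * p_i + 2.34 * r_i + 2.45


def top_k_overlap(ranked_a, ranked_b, k=500):
    """Return (shared, only_a, only_b) counts for the top-k of two rankings."""
    top_a, top_b = set(ranked_a[:k]), set(ranked_b[:k])
    shared = top_a & top_b
    return len(shared), len(top_a - shared), len(top_b - shared)


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy = [
        (1, {"DrugA"}, {"ProblemX"}),
        (2, {"DrugA"}, {"ProblemX", "ProblemY"}),
        (3, {"DrugA"}, {"ProblemY"}),
        (4, {"DrugB"}, {"ProblemX"}),
    ]
    for pair, (support, confidence) in sorted(support_confidence(toy).items()):
        print(pair, support, round(confidence, 3))

    # Score computed from the p_i and r_i values listed for the first
    # row of the crowdsourcing table.
    print(crowd_score(1396, 0.749))

    print(top_k_overlap(["a", "b", "c"], ["b", "c", "d"], k=2))
```

Pairs would then be filtered with the minimum support and confidence thresholds before ranking. The chi-squared ranking and Spearman's rho are omitted here and would normally come from a statistics library.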