Data and Computational
 Challenges in Integrative
 Biomedical Informatics
Joel Saltz MD, PhD
Chair Department of Biomedical Informatics,
Director Center for Comprehensive Informatics
Emory University
Adjunct Professor CSE, CS
College of Computing, Georgia Tech
Center for Comprehensive Informatics




INTEGRATIVE DATA ANALYTICS
Integrative Biomedical Informatics Analytics
• Anatomic/functional characterization at fine level (Pathology) and gross level (Radiology)
• High throughput multi-scale image segmentation, feature extraction, analysis of features
• Integration of anatomic/functional characterization with multiple types of “omic” information
[Figure labels: Radiology Imaging; Pathologic Features; “Omic” Data; Patient Outcome]
Integrative Spatio-Temporal Molecular Analytics
                                       • Aka Big Data
Quantitative Feature Analysis in Pathology:
Emory In Silico Center for Brain Tumor
Research (PI = Dan Brat, PD= Joel Saltz)
Using TCGA Data to Study
 Glioblastoma

Diagnostic Improvement

Molecular Classification

Predictors of Progression
Millions of Nuclei Defined by n Features


• Top-down analysis: use the features
  with existing diagnostic constructs

• Bottom-up analysis: let features define
  and drive the analysis
TCGA Whole Slide Images
Step 1: Nuclei Segmentation
• Identify individual nuclei and their boundaries




                Jun Kong
Nuclear Analysis Workflow
Step 1: Nuclei Segmentation
Step 2: Feature Extraction




• Describe individual nuclei in terms of size,
  shape, and texture
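
The size and shape descriptors named above can be illustrated with a minimal sketch. It assumes each segmented nucleus is available as an ordered list of (x, y) boundary points; the function names are hypothetical and texture features are not covered. Circularity is computed with the standard 4πA/P² definition, which may differ from the definition used in the actual pipeline.

```python
import math

def polygon_area(points):
    """Area of a closed polygon via the shoelace formula (size feature)."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def polygon_perimeter(points):
    """Sum of edge lengths around the closed boundary."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def circularity(points):
    """4*pi*area / perimeter**2: close to 1 for a circle, lower for irregular shapes."""
    perimeter = polygon_perimeter(points)
    return 4.0 * math.pi * polygon_area(points) / (perimeter * perimeter)

# Hypothetical boundaries: a unit square and a 360-sided approximation of a circle.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
round_nucleus = [(math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
                 for k in range(360)]
features = {
    "square": (polygon_area(square), circularity(square)),
    "round": (polygon_area(round_nucleus), circularity(round_nucleus)),
}
```

A rounder boundary scores higher, which is the sense in which a feature such as circularity can separate nuclear shapes.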
Step 3: Nuclei Classification
Nuclear Qualities: scale from 1 (Oligodendroglioma) to 10 (Astrocytoma)
Comparison of Machine-based Classification
        to Human Based Classification




Separation of GBM, Oligo1, Oligo2 as Designated by Neuropathologists
Separation of GBM, Oligo1 and Oligo2 as Designated by Machine
Survival Analysis




Human            Machine
Gene Expression Correlates of High Oligo-Astro
    Ratio on Machine-based Classification

                               Oligo Related Genes

                               Myelin Basic Protein
                               Proteolipoprotein
                               HoxD1



                               Nuclear features most
                               Associated with Oligo
                               Signature Genes:

                               Circularity (high)
                               Eccentricity (low)
Millions of Nuclei Defined by n Features


• Top-down analysis: analyze features in
  context of existing diagnostic constructs

• Bottom-up analysis: let nuclear features
  define and drive the analysis
Direct Study of Relationship Between
                                                 vs
                                                                              Lee Cooper,
                                                                              Carlos Moreno
Clustering identifies three morphological groups
                                       • Analyzed 200 million nuclei from 162 TCGA GBMs (462 slides)
                                       • Named for functions of associated genes:
                                         Cell Cycle (CC), Chromatin Modification (CM),
                                         Protein Biosynthesis (PB)
                                       • Prognostically-significant (logrank p=4.5e-4)
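
Survival differences between groups such as CC, CM, and PB are commonly displayed as Kaplan-Meier curves. The sketch below is a minimal pure-Python estimator, offered under the assumption (not stated on the slide) that the curves were produced this way. The input values are made up for illustration and the log-rank test itself is not implemented here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time in days for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each time with a death.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)            # patients leaving at t
        deaths = sum(1 for tt, e in data if tt == t and e == 1)  # observed deaths at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= n_at_t
        i += n_at_t
    return curve

# Hypothetical follow-up data for one cluster (not taken from the study).
example_curve = kaplan_meier([100, 200, 200, 300, 400], [1, 1, 0, 1, 0])
```

One curve would be computed per cluster and the curves compared, for example with a log-rank test from a statistics package.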


[Figure: feature indices (10 to 50) by cluster (CC, CM, PB); survival (0 to 1) vs. days (0 to 3000) for CC, CM, PB]

                                       Associations
HEALTHCARE DATA ANALYTICS
Clinical Phenotype Characterization and the Emory Analytic Information Warehouse
                                       • Example Project: Find hot spots in readmissions within 30 days
                                          – What fraction of patients with a given principal diagnosis will be
                                            readmitted within 30 days?
                                          – What fraction of patients with a given set of diseases will be readmitted
                                            within 30 days?
                                          – How do severity and time course of co-morbidities affect
                                            readmissions?
                                          – Geographic analyses

                                       • Compare and contrast with UHC Clinical Data Base
                                          – Repeat analyses across all UHC hospitals
                                          – Are we performing the same?
                                          – How are UHC-curated groupings of patients (e.g., product lines) useful?

                                       • Need a repeatable process that we can apply identically to both
                                         local and UHC data
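
The first question above (the fraction of patients with a given principal diagnosis who are readmitted within 30 days) can be sketched as follows. The record layout (`patient_id`, `admit`, `discharge`, `principal_dx`) and function names are hypothetical, and the rule that every encounter counts as an index encounter is an assumption. The actual project also excluded some encounter pairs, which is not modeled here.

```python
from collections import defaultdict
from datetime import date

def flag_readmits(encounters, window_days=30):
    """Pair each encounter with True if the same patient is admitted again
    within window_days of its discharge."""
    by_patient = defaultdict(list)
    for enc in encounters:
        by_patient[enc["patient_id"]].append(enc)
    flagged = []
    for visits in by_patient.values():
        visits.sort(key=lambda e: e["admit"])
        for current, following in zip(visits, visits[1:] + [None]):
            readmitted = (
                following is not None
                and 0 <= (following["admit"] - current["discharge"]).days <= window_days
            )
            flagged.append((current, readmitted))
    return flagged

def readmit_fraction_by_dx(encounters, window_days=30):
    """Fraction of index encounters followed by a readmission, per principal diagnosis."""
    totals, readmits = defaultdict(int), defaultdict(int)
    for enc, readmitted in flag_readmits(encounters, window_days):
        totals[enc["principal_dx"]] += 1
        readmits[enc["principal_dx"]] += int(readmitted)
    return {dx: readmits[dx] / totals[dx] for dx in totals}

# Made-up encounters for illustration only.
sample_encounters = [
    {"patient_id": 1, "admit": date(2012, 1, 1), "discharge": date(2012, 1, 5), "principal_dx": "428"},
    {"patient_id": 1, "admit": date(2012, 1, 20), "discharge": date(2012, 1, 25), "principal_dx": "428"},
    {"patient_id": 2, "admit": date(2012, 2, 1), "discharge": date(2012, 2, 3), "principal_dx": "428"},
    {"patient_id": 3, "admit": date(2012, 3, 1), "discharge": date(2012, 3, 2), "principal_dx": "486"},
    {"patient_id": 3, "admit": date(2012, 6, 1), "discharge": date(2012, 6, 3), "principal_dx": "486"},
]
fractions = readmit_fraction_by_dx(sample_encounters)
```

Because the function takes plain records, the same code could in principle be run unchanged on local and UHC extracts, which is the repeatability requirement stated above.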
Overall System
[Figure: system diagram. Components: Investigator; I2b2 Web Server; I2b2 Database; Metadata Repository; Metadata Manager; Data Modeler; Data Processing; Query Specification; Data Analyst; Database Mapper; Query tools; Study-specific Database; Source data]
5-year Datasets from Emory and University HealthSystem Consortium
                                       • EUH, EUHM and WW (inpatient encounters)
                                       • Removed encounter pairs with chemotherapy and radiation
                                         therapy readmit encounters (CDW data)

                                       •   Encounter location (down to unit for Emory)
                                       •   Providers (Emory only)
                                       •   Discharge disposition
                                       •   Primary and secondary ICD9 codes
                                       •   Procedure codes
                                       •   DRGs
                                       •   Medication orders (Emory only)
                                       •   Labs (Emory only)
                                       •   Vitals (Emory only)
                                       •   Geographic information (CDW only + US Census and American
                                           Community Survey)
Using Emory & UHC Data to Find Associations With 30-day Readmits
                                       • Problem: “Raw” clinical and administrative variables
                                         are difficult to use for associative data mining
                                          – Too many diagnosis codes, procedure codes
                                          – Continuous variables (e.g., labs) require interpretation
                                          – Temporal relationships between variables are implicit
                                       • Solution: Transform the data into a much smaller set
                                         of variables using heuristic knowledge
                                          – Categorize diagnosis and procedure codes using code
                                            hierarchies
                                          – Classify continuous variables using standard
                                            interpretations (e.g., high, normal, low)
                                          – Identify temporal patterns (e.g., frequency, duration,
                                            sequence)
                                          – Apply standard data mining techniques
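
Two of the transformations above can be sketched in a few lines. The sketch assumes dotted ICD-9 codes and caller-supplied reference ranges. The range in the example is illustrative only and is not clinical guidance, and all function and field names are hypothetical.

```python
def classify_lab(value, low, high):
    """Map a continuous lab value to 'low', 'normal', or 'high' for a given reference range."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"

def icd9_category(code):
    """Roll a dotted ICD-9 code up one level of the hierarchy, e.g. '250.02' -> '250'."""
    return code.strip().split(".")[0]

def derive_variables(encounter, ranges):
    """Reduce raw codes and lab values to a smaller set of categorical variables."""
    derived = {
        "dx_categories": sorted({icd9_category(c) for c in encounter["dx_codes"]})
    }
    for lab, value in encounter["labs"].items():
        low, high = ranges[lab]
        derived[lab] = classify_lab(value, low, high)
    return derived

# Illustrative inputs (made up; the reference range is a placeholder).
example_ranges = {"potassium": (3.5, 5.0)}
example_encounter = {"dx_codes": ["250.02", "250.00", "428.0"], "labs": {"potassium": 5.6}}
example_derived = derive_variables(example_encounter, example_ranges)
```

Three raw diagnosis codes collapse to two categories here, which is the kind of reduction that makes associative data mining more tractable.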

Derived Variables
                                 •     30-day readmit
                                 •     The 9 Emory Enhanced Risk Assessment Tool diagnosis categories
                                 •     UHC product lines
                                 •     Variables derived from a combination of codes and/or laboratory test results
                                        – Obesity
                                        – Diabetes/uncontrolled diabetes
                                        – End-stage renal disease (ESRD)
                                        – Pressure ulcer
                                        – Sickle cell disease/sickle cell crisis
                                 •     Temporal variables derived over multiple encounters
                                        – Multiple MI
                                        – Multiple 30-day readmissions
                                        – Chemotherapy within 180 (or 365) days before surgery
                                        – Previous encounter within the last 90 (or 180) days
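
Temporal variables of this kind reduce to date arithmetic over a patient's encounter history. The following is a minimal sketch with hypothetical function names and made-up dates. It does not reproduce the warehouse's actual definitions, for example how same-day events are treated.

```python
from datetime import date

def occurred_within_before(event_dates, anchor, days):
    """True if any event falls strictly before the anchor date and no more than `days` earlier."""
    return any(0 < (anchor - d).days <= days for d in event_dates)

def has_multiple(event_dates, minimum=2):
    """True if at least `minimum` distinct event dates are recorded (e.g. multiple MI)."""
    return len(set(event_dates)) >= minimum

# Made-up history for one patient.
chemo_dates = [date(2011, 9, 1)]
surgery_date = date(2012, 1, 15)
mi_dates = [date(2010, 5, 2), date(2011, 11, 20)]

chemo_180 = occurred_within_before(chemo_dates, surgery_date, 180)
chemo_90 = occurred_within_before(chemo_dates, surgery_date, 90)
multiple_mi = has_multiple(mi_dates)
```

The same helper covers "previous encounter within the last 90 (or 180) days" by passing prior encounter dates and the current admission date.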
30-Day Readmission Rates for Derived Variables
Emory Health Care
Geographic Analyses
                                       UHC Medicine General Product Line (#15)
Predictive Modeling for Readmission
                                       • Random forests (ensemble of decision trees)
                                         – Create a decision tree using a random subset of the
                                           variables in the dataset
                                         – Generate a large number of such trees
                                         – All trees vote to classify each test example
                                         – Generate a patient-specific readmission risk for each
                                           encounter
                                       • Rank the encounters by risk for a subsequent 30-day
                                         readmission
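
The steps above can be sketched in plain Python. This is a toy illustration, not the model used in the study. It assumes binary derived variables, gives each tree a random subset of the variables as the slide describes, and adds bootstrap resampling and a depth limit as assumptions. All names and data are hypothetical.

```python
import random

def gini(rows):
    """Gini impurity of the binary labels in rows; each row is (features, label)."""
    p = sum(label for _, label in rows) / len(rows)
    return 2.0 * p * (1.0 - p)

def build_tree(rows, feats, depth=0, max_depth=3):
    """Grow a small tree on binary features, splitting only on the variables in feats."""
    labels = [label for _, label in rows]
    frac = sum(labels) / len(labels)
    if depth == max_depth or frac in (0.0, 1.0):
        return frac                                  # leaf: readmission fraction
    best = None
    for f in feats:
        left = [r for r in rows if r[0][f] == 0]
        right = [r for r in rows if r[0][f] == 1]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[0]:
            best = (score, f, left, right)
    if best is None:
        return frac
    _, f, left, right = best
    return (f, build_tree(left, feats, depth + 1, max_depth),
            build_tree(right, feats, depth + 1, max_depth))

def tree_predict(tree, x):
    """Walk from the root to a leaf and return the leaf's readmission fraction."""
    while isinstance(tree, tuple):
        f, left, right = tree
        tree = right if x[f] == 1 else left
    return tree

def train_forest(rows, n_features, n_trees=50, n_try=2, seed=0):
    """Each tree gets a random subset of variables and a bootstrap sample of the rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        feats = rng.sample(range(n_features), n_try)
        sample = [rng.choice(rows) for _ in rows]
        forest.append(build_tree(sample, feats))
    return forest

def risk(forest, x):
    """Encounter-level risk: fraction of trees voting 'readmit'."""
    return sum(1 for t in forest if tree_predict(t, x) >= 0.5) / len(forest)

# Synthetic data in which variable 0 alone determines readmission.
rows = [([a, b, c], a) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 5
forest = train_forest(rows, n_features=3)
ranked = sorted([[1, 0, 0], [0, 0, 0]], key=lambda x: risk(forest, x), reverse=True)
```

Sorting encounters by `risk` gives the ranking described in the last bullet. A production analysis would more likely use an established library implementation.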



Emory Readmission Rates for High and Low Risk Groups Generated with Random Forest
Predictive Modeling Applied to 180 UHC Hospitals
                                       Readmission fraction of top 10% high risk patients
[Figure: readmission fraction (0 to 0.9) among the top 10% highest-risk patients, by hospital; series: All Hospital Model, Individual Hospital Model]
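
The quantity plotted for each hospital, the readmission fraction among the 10% of patients ranked highest by predicted risk, can be computed as in the sketch below. It assumes a list of (risk, readmitted) pairs per hospital. The function name, the rounding rule for the cutoff, and the data are hypothetical.

```python
def top_fraction_readmit_rate(scored, fraction=0.10):
    """Readmission fraction among the highest-risk `fraction` of encounters.

    scored: list of (predicted_risk, readmitted) pairs, readmitted being True or False.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))    # size of the high-risk group
    top = ranked[:k]
    return sum(1 for _, readmitted in top if readmitted) / k

# Made-up scores for 20 encounters at one hospital.
scored = [(0.9, True), (0.8, False)] + [(0.1, False)] * 18
top_rate = top_fraction_readmit_rate(scored)
overall_rate = sum(1 for _, r in scored if r) / len(scored)
```

Comparing `top_rate` with `overall_rate` indicates how much the ranking concentrates readmissions in the high-risk group.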
Status of Healthcare Data Analytics
                                       • Integrative dataset analysis can leverage patient
                                         information gathered over many encounters
                                       • Temporal analyses can generate derived variables that
                                         appear to correlate with readmissions
                                       • Predictive modeling has promise of providing decision
                                         support
                                       • Data Analytics arm of the Emory New Care Model
                                         Initiative led by Greg Esper
                                       • Ongoing analyses involve characterization of clinical
                                         phenotype in GWAS, biomarker and quality
                                         improvement efforts
                                       • Co-lead (with Bill Hersh) of CTSA CER Informatics
                                         taskforce dedicated to this issue
HIGH END AND LARGE DATA COMPUTING
Supercomputing – Collaboration with ORNL: Titan – Peak Speed 30,000,000,000,000,000 floating point operations per second!
Core Transformations for multi-scale pipelines
                                       • Data Cleaning and Low Level Transformations
                                       • Data Subsetting, Filtering, Subsampling
                                       • Spatio-temporal Mapping and Registration
                                       • Object Segmentation
                                       • Feature Extraction, Object Classification
                                       • Spatio-temporal Aggregation
                                       • Change Detection, Comparison, and Quantification
Extreme DataCutter – Two Level Model
                                       Node Level Work Scheduling
VLDB 2012
                                       Change Detection, Comparison, and Quantification
Thanks!

 
Twenty Years of Whole Slide Imaging - the Coming Phase Change
Twenty Years of Whole Slide Imaging - the Coming Phase ChangeTwenty Years of Whole Slide Imaging - the Coming Phase Change
Twenty Years of Whole Slide Imaging - the Coming Phase Change
 
Digital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision MedicineDigital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision Medicine
 
Pathomics Based Biomarkers and Precision Medicine
Pathomics Based Biomarkers and Precision MedicinePathomics Based Biomarkers and Precision Medicine
Pathomics Based Biomarkers and Precision Medicine
 
Computational Pathology Workshop July 8 2014
Computational Pathology Workshop July 8 2014Computational Pathology Workshop July 8 2014
Computational Pathology Workshop July 8 2014
 
Exascale Computing and Experimental Sensor Data
Exascale Computing and Experimental Sensor DataExascale Computing and Experimental Sensor Data
Exascale Computing and Experimental Sensor Data
 
Exascale Challenges: Space, Time, Experimental Science and Self Driving Cars
Exascale Challenges: Space, Time, Experimental Science and Self Driving Cars Exascale Challenges: Space, Time, Experimental Science and Self Driving Cars
Exascale Challenges: Space, Time, Experimental Science and Self Driving Cars
 
Data Science, Big Data and You
Data Science, Big Data and YouData Science, Big Data and You
Data Science, Big Data and You
 
Presentation at UHC Annual Meeting
Presentation at UHC  Annual MeetingPresentation at UHC  Annual Meeting
Presentation at UHC Annual Meeting
 
Indiana 4 2011 Final Final
Indiana 4 2011 Final FinalIndiana 4 2011 Final Final
Indiana 4 2011 Final Final
 
Actsi bip overview jan 2011
Actsi bip overview jan 2011Actsi bip overview jan 2011
Actsi bip overview jan 2011
 

Recently uploaded

Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment BookingCall Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Bookingnarwatsonia7
 
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...narwatsonia7
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...narwatsonia7
 
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service JaipurHigh Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipurparulsinha
 
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service BangaloreCall Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalorenarwatsonia7
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingNehru place Escorts
 
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
See the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformSee the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformKweku Zurek
 
Call Girls Viman Nagar 7001305949 All Area Service COD available Any Time
Call Girls Viman Nagar 7001305949 All Area Service COD available Any TimeCall Girls Viman Nagar 7001305949 All Area Service COD available Any Time
Call Girls Viman Nagar 7001305949 All Area Service COD available Any Timevijaych2041
 
Hematology and Immunology - Leukocytes Functions
Hematology and Immunology - Leukocytes FunctionsHematology and Immunology - Leukocytes Functions
Hematology and Immunology - Leukocytes FunctionsMedicoseAcademics
 
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...narwatsonia7
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptxDr.Nusrat Tariq
 
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...narwatsonia7
 
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls ServiceCall Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Servicesonalikaur4
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...narwatsonia7
 
Call Girls Service Noida Maya 9711199012 Independent Escort Service Noida
Call Girls Service Noida Maya 9711199012 Independent Escort Service NoidaCall Girls Service Noida Maya 9711199012 Independent Escort Service Noida
Call Girls Service Noida Maya 9711199012 Independent Escort Service NoidaPooja Gupta
 
97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAA97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAAjennyeacort
 
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 

Recently uploaded (20)

Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment BookingCall Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
Call Girl Koramangala | 7001305949 At Low Cost Cash Payment Booking
 
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
Housewife Call Girls Bangalore - Call 7001305949 Rs-3500 with A/C Room Cash o...
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
 
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service JaipurHigh Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
High Profile Call Girls Jaipur Vani 8445551418 Independent Escort Service Jaipur
 
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service BangaloreCall Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
Call Girl Bangalore Nandini 7001305949 Independent Escort Service Bangalore
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
 
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Whitefield Just Call 7001305949 Top Class Call Girl Service Available
 
See the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy PlatformSee the 2,456 pharmacies on the National E-Pharmacy Platform
See the 2,456 pharmacies on the National E-Pharmacy Platform
 
Call Girls Viman Nagar 7001305949 All Area Service COD available Any Time
Call Girls Viman Nagar 7001305949 All Area Service COD available Any TimeCall Girls Viman Nagar 7001305949 All Area Service COD available Any Time
Call Girls Viman Nagar 7001305949 All Area Service COD available Any Time
 
Hematology and Immunology - Leukocytes Functions
Hematology and Immunology - Leukocytes FunctionsHematology and Immunology - Leukocytes Functions
Hematology and Immunology - Leukocytes Functions
 
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
Call Girls Frazer Town Just Call 7001305949 Top Class Call Girl Service Avail...
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptx
 
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hsr Layout Just Call 7001305949 Top Class Call Girl Service Available
 
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
 
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls ServiceCall Girls Thane Just Call 9910780858 Get High Class Call Girls Service
Call Girls Thane Just Call 9910780858 Get High Class Call Girls Service
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
 
Call Girls Service Noida Maya 9711199012 Independent Escort Service Noida
Call Girls Service Noida Maya 9711199012 Independent Escort Service NoidaCall Girls Service Noida Maya 9711199012 Independent Escort Service Noida
Call Girls Service Noida Maya 9711199012 Independent Escort Service Noida
 
97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAA97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAA
 
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jp Nagar Just Call 7001305949 Top Class Call Girl Service Available
 

Data and Computational Challenges in Integrative Biomedical Informatics

  • 1. Data and Computational Challenges in Integrative Biomedical Informatics. Joel Saltz MD, PhD; Chair, Department of Biomedical Informatics; Director, Center for Comprehensive Informatics, Emory University; Adjunct Professor, CSE, CS, College of Computing, Georgia Tech
  • 2. Integrative Data Analytics (Center for Comprehensive Informatics)
  • 3. Integrative Biomedical Informatics Analytics • Anatomic/functional characterization at fine level (Pathology) and gross level (Radiology) • High throughput multi-scale image segmentation, feature extraction, analysis of features • Integration of anatomic/functional characterization with multiple types of “omic” information [Figure labels: Radiology Imaging, “Omic” Data, Pathologic Features, Patient Outcome]
  • 4. Integrative Spatio-Temporal Molecular Analytics • Aka Big Data
  • 5. Quantitative Feature Analysis in Pathology: Emory In Silico Center for Brain Tumor Research (PI = Dan Brat, PD = Joel Saltz)
  • 6. Using TCGA Data to Study Glioblastoma: Diagnostic Improvement; Molecular Classification; Predictors of Progression
  • 7. Millions of Nuclei Defined by n Features • Top-down analysis: use the features with existing diagnostic constructs • Bottom-up analysis: let features define and drive the analysis
  • 8. TCGA Whole Slide Images. Step 1: Nuclei Segmentation • Identify individual nuclei and their boundaries (Jun Kong)
  • 9. Nuclear Analysis Workflow. Step 1: Nuclei Segmentation. Step 2: Feature Extraction • Describe individual nuclei in terms of size, shape, and texture
  • 10. Step 3: Nuclei Classification. Nuclear Qualities [Figure: scale from 1 to 10, labeled Oligodendroglioma and Astrocytoma]
  • 11. Comparison of Machine-based Classification to Human Based Classification [Figure panels: Separation of GBM, Oligo1, Oligo2 as Designated by Machine; Separation of GBM, Oligo1 and Oligo2 as Designated by Neuropathologists]
  • 13. Gene Expression Correlates of High Oligo-Astro Ratio on Machine-based Classification. Oligo Related Genes: Myelin Basic Protein, Proteolipoprotein, HoxD1. Nuclear Features Most Associated with Oligo Signature Genes: Circularity (high), Eccentricity (low)
  • 14. Millions of Nuclei Defined by n Features • Top-down analysis: analyze features in context of existing diagnostic constructs • Bottom-up analysis: let nuclear features define and drive the analysis
  • 15. Direct Study of Relationship Between [figure] vs [figure] (Lee Cooper, Carlos Moreno)
  • 16. Clustering identifies three morphological groups • Analyzed 200 million nuclei from 162 TCGA GBMs (462 slides) • Named for functions of associated genes: Cell Cycle (CC), Chromatin Modification (CM), Protein Biosynthesis (PB) • Prognostically-significant (logrank p=4.5e-4) [Figures: heat map of Feature Indices by cluster (CC, CM, PB); Survival vs. Days curves for CC, CM, PB]
  • 17. Associations
  • 18. Healthcare Data Analytics
  • 19. Clinical Phenotype Characterization and the Emory Analytic Information Warehouse • Example Project: Find hot spots in readmissions within 30 days – What fraction of patients with a given principal diagnosis will be readmitted within 30 days? – What fraction of patients with a given set of diseases will be readmitted within 30 days? – How do severity and time course of co-morbidities affect readmissions? – Geographic analyses • Compare and contrast with UHC Clinical Data Base – Repeat analyses across all UHC hospitals – Are we performing the same? – How are UHC-curated groupings of patients (e.g., product lines) useful? • Need a repeatable process that we can apply identically to both local and UHC data
  • 20. Overall System [Architecture diagram. Components: Source data (three sources), Database Mapper, Metadata Repository, Metadata Manager, Data Processing, Query Specification, Study-specific Database, I2b2 Server, I2b2 Database, I2b2 Web, Query tools. Roles: Investigator, Data Modeler, Data Analyst]
  • 21. 5-year Datasets from Emory and University Healthcare Consortium • EUH, EUHM and WW (inpatient encounters) • Removed encounter pairs with chemotherapy and radiation therapy readmit encounters (CDW data) • Encounter location (down to unit for Emory) • Providers (Emory only) • Discharge disposition • Primary and secondary ICD9 codes • Procedure codes • DRGs • Medication orders (Emory only) • Labs (Emory only) • Vitals (Emory only) • Geographic information (CDW only + US Census and American Community Survey)
  • 22. Using Emory & UHC Data to Find Associations With 30-day Readmits • Problem: “Raw” clinical and administrative variables are difficult to use for associative data mining – Too many diagnosis codes, procedure codes – Continuous variables (e.g., labs) require interpretation – Temporal relationships between variables are implicit • Solution: Transform the data into a much smaller set of variables using heuristic knowledge – Categorize diagnosis and procedure codes using code hierarchies – Classify continuous variables using standard interpretations (e.g., high, normal, low) – Identify temporal patterns (e.g., frequency, duration, sequence) – Apply standard data mining techniques
  • 23. Derived Variables • 30-day readmit • The 9 Emory Enhanced Risk Assessment Tool diagnosis categories • UHC product lines • Variables derived from a combination of codes and/or laboratory test results – Obesity – Diabetes/uncontrolled diabetes – End-stage renal disease (ESRD) – Pressure ulcer – Sickle cell disease/sickle cell crisis • Temporal variables derived over multiple encounters – Multiple MI – Multiple 30-day readmissions – Chemotherapy within 180 (or 365) days before surgery – Previous encounter within the last 90 (or 180) days
  • 24. 30-Day Readmission Rates for Derived Variables (Emory Health Care)
  • 25. Geographic Analyses: UHC Medicine General Product Line (#15)
  • 26. Predictive Modeling for Readmission • Random forests (ensemble of decision trees) – Create a decision tree using a random subset of the variables in the dataset – Generate a large number of such trees – All trees vote to classify each test example in a training dataset – Generate a patient-specific readmission risk for each encounter • Rank the encounters by risk for a subsequent 30-day readmission
  • 27. Emory Readmission Rates for High and Low Risk Groups Generated with Random Forest
  • 28. Predictive Modeling Applied to 180 UHC Hospitals: Readmission fraction of top 10% high risk patients [Figure: readmission fraction (0 to 0.9) by hospital; series: All Hospital Model, Individual Hospital Model]
  • 29. Status of Healthcare Data Analytics • Integrative dataset analysis can leverage patient information gathered over many encounters • Temporal analyses can generate derived variables that appear to correlate with readmissions • Predictive modeling has promise of providing decision support • Data Analytics arm of the Emory New Care Model Initiative led by Greg Esper • Ongoing analyses involve characterization of clinical phenotype in GWAS, biomarker and quality improvement efforts • Co-lead (with Bill Hersh) of CTSA CER Informatics taskforce dedicated to this issue
  • 30. High End Computing and Large Data
  • 31. Supercomputing – Collaboration with ORNL: Titan – Peak Speed 30,000,000,000,000,000 floating point operations per second!
  • 32.
  • 33. Core Transformations for multi-scale pipelines • Data Cleaning and Low Level Transformations • Data Subsetting, Filtering, Subsampling • Spatio-temporal Mapping and Registration • Object Segmentation • Feature Extraction, Object Classification • Spatio-temporal Aggregation • Change Detection, Comparison, and Quantification
  • 34. Extreme DataCutter – Two Level Model
  • 35. Node Level Work Scheduling
  • 36. VLDB 2012. Change Detection, Comparison, and Quantification
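
The readmission slides (19 to 23) describe deriving a 30-day readmit variable across encounters and classifying continuous lab values as high, normal, or low. The following is a minimal Python sketch of those two derivations, not the Analytic Information Warehouse implementation. Several things here are assumptions for illustration only: the encounter field names (`patient_id`, `admit`, `discharge`), the function names, the rule that a readmission is any later admission starting 0 to 30 days after the prior discharge, and the example reference range. The chemotherapy and radiation therapy exclusions mentioned on slide 21 are not modeled.

```python
from datetime import date


def flag_30_day_readmits(encounters, window_days=30):
    """Map (patient_id, admit date) -> True if the patient's next admission
    starts within `window_days` of this encounter's discharge.

    Assumed rule (not from the slides): any later admission for the same
    patient counts, with no exclusions for planned readmissions.
    """
    by_patient = {}
    for enc in encounters:
        by_patient.setdefault(enc["patient_id"], []).append(enc)

    flags = {}
    for pid, encs in by_patient.items():
        encs = sorted(encs, key=lambda e: e["admit"])
        # Compare each encounter with the one that follows it in time.
        for cur, nxt in zip(encs, encs[1:]):
            gap = (nxt["admit"] - cur["discharge"]).days
            flags[(pid, cur["admit"])] = 0 <= gap <= window_days
        # The last encounter on record has no later admission to count.
        flags[(pid, encs[-1]["admit"])] = False
    return flags


def readmit_fraction(flags):
    """Fraction of index encounters followed by a readmission."""
    return sum(flags.values()) / len(flags) if flags else 0.0


def classify_lab(value, low, high):
    """Turn a continuous lab value into 'low', 'normal', or 'high'
    relative to a reference range supplied by the caller."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


# Hypothetical encounters, invented purely to exercise the functions.
encounters = [
    {"patient_id": "A", "admit": date(2012, 1, 1), "discharge": date(2012, 1, 5)},
    {"patient_id": "A", "admit": date(2012, 1, 20), "discharge": date(2012, 1, 22)},
    {"patient_id": "A", "admit": date(2012, 3, 30), "discharge": date(2012, 4, 2)},
    {"patient_id": "B", "admit": date(2012, 2, 10), "discharge": date(2012, 2, 12)},
]
flags = flag_30_day_readmits(encounters)
print(readmit_fraction(flags))               # 1 of 4 index encounters: 0.25
print(classify_lab(5.6, low=3.5, high=5.0))  # illustrative range: high
```

Keeping the readmit flag per index encounter, rather than per patient, matches the slides' framing of ranking encounters by risk. The resulting boolean and categorical columns could then serve as inputs to a classifier such as the random forest described on slide 26.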

Editor's Notes

  1. Combine with next slide. Graphical representation