SlideShare una empresa de Scribd logo
1 de 25
Descargar para leer sin conexión
Causal Inference and Explanation
to Improve Human Health
Samantha Kleinberg
Stevens Institute of Technology
Can we learn..
risk factors for heart failure
causes of secondary brain injury
what leads to an individual’s hypoglycemic
episodes
from observing people?
Why worry about causality?
• Robust predictions
• Coherent explanations
• Actionable information
Why observational data?
• Experiments often infeasible, unethical, or too
expensive
• Routinely collected in many situations
• Electronic health records in hospitals
• ICU data streams
• Body-worn sensors and mobile devices
This talk
• We can make some progress in getting causes
from medical data
• But should worry about missing data and
nonstationarity
• Explanation can be automated (sometimes)
Logic-based causal
inference
• Complex, temporal relationships
• Assess average difference cause makes to
probability of effect
Kleinberg, S. (2012) Causality, Probability, and Time. Cambridge University Press.
Kleinberg, S. (2015) Why: A Guide to Finding and Using Causes. O’Reilly Media
• Main idea: looking for better explanations for the
effect
• Inferring timing
• Instead of accepting/rejecting hypotheses, refine
them from data using greedy search
• Can start by testing relationships between all
variables and CHF in 1-2 weeks, and ultimately
infer "high AST leads to CHF in 4-10 days”
Application: understanding
stroke recovery
Massive amounts of data collected in ICU (>100,000
measurements per person)
How do patients change over time?
NICU dataset
• 98 patients with subarachnoid hemorrhage
• Monitoring included
• Depth and surface EEG
• Microdialysis
• Physiologic measurements
(no data on procedures)
Lots of data, but lots of missing
data
• Device malfunctions
• Device connected to perform a procedure
• Different monitors started at different times
• Different recording frequencies
TW% BrT CI CO2EX CPP CVP ELWI GEDI GLU GLU2panel HR ICP
0.56001433 0.55845172 0.45439822 0.77027571 0.89076666 0.80927443 0.47893916 0.47893916 0.07274971 0.07943021 0.98646971 0.92582606
0.69029286 0.71585128 0.67012404 0.80326783 0.91217418 0.84294797 0.56494303 0.56494303 0.13405794 0.03836355 0.98382787 0.94543471
56.0014325 55.845172 45.4398223 77.0275708 89.0766656 80.9274429 47.893916 47.893916 7.27497093 7.94302149 98.6469713 92.5826062
69.0292858 71.5851282 67.0124038 80.3267833 91.2174176 84.294797 56.4943028 56.4943028 13.4057939 3.83635458 98.3827871 94.5434715
0D
20D
40D
60D
80D
100D
TW%D
BrTD
CID
CO2EXD
CPPD
CVPD
ELWID
GEDID
GLUD
GLU2panelD
HRD
ICPD
LACD
LGRD
LPRD
MAPD
MVD
PEEPD
PGRD
PYRD
PbtO2D
RRD
SPO2%D
SVVD
SvO2D
TMPD
rCBFD
%"data"recorded"per"
variable"
0296hrsD
962168hrsD
And…
• All variables may be missing at once (if measured
by single device)
• Can’t use imputation methods that assume
some present values
• Missing values depend on variable (NMAR) + other
variables (MAR)
• e.g. BG depends on itself, as well as insulin
• Variables are correlated across time
Imputation w/lagged
correlations
S. A. Rahman, Y. Huang, J. Claassen, N. Heintzman, and S. Kleinberg. Combining Fourier and
Lagged k-Nearest Neighbor Imputation for Biomedical Time Series Data. Journal of Biomedical
Informatics (2015)
https://github.com/kleinberg-lab/FLK-NN
Evaluation
Regulation changes over
time
0-96hrs
96-168 hrs
Claassen J, Rahman SA, Huang Y, Frey H, Schmidt M, Albers D, Falo CM, Park S, Agarwal S, Connolly ES,
Kleinberg S (2015) Causal structure of brain physiology after brain injury. PLoS ONE.
Explanation
• Why is an individual’s brain swelling at a particular
time?
• What caused a specific stock market crash?
• Who is at fault for a car accident?
• Inference finds type-level relationships







• Explanation tells us why specific events happened







heart failure
uncontrolled
diabetes
thyroid
disfunction
heart failure
uncontrolled
diabetes
thyroid
disfunction
• Goals for explanation
• Find causes of specific events automatically (no
human in the loop)
• Find causes of when, whether and how events
occur
• Approach: simulation to answer counterfactual
queries
C. Merck and S. Kleinberg. Causal explanation under indeterminism:
A sampling approach. AAAI, 2016.
https://github.com/kleinberg-lab/stoch_cf
Counterfactual vs. Actual
Distributions
• P(•|¬A) is the counterfactual
distribution of A



• P(•|A) is the actual distribution
of A

Three Types of Explanation
B because of A iff

P (B|A) >> P (B|¬A)
B hastened by A iff

E[tB |A] << E[tB |¬A]
B intensified by A iff

E[mB |A] >> E[mB |¬A] 

B despite A iff

P (B|A) << P (B|¬A)
B delayed by A iff

E[tB |A] >> E[tB |¬A]
B attenuated by A iff

E[mB |A] << E[mB |¬A] 

tB = time of B occurring
mB = intensity (manner) of B occurring
probability:

timing:

intensity:
Causal Chain
with Billiard Balls
Hastening
• 6->8, then 8->P
• but 7->8 is a more reliable backup
• probability raising finds

“8->P despite 6->8”
• but by analyzing timing we find

“6->8 hastened 8->P”
Diabetes Simulation
• Run or Lunch alone would not
have caused Hypoglycemia

(see counterfactual dists)
• Yet together they explain the
Hypoglycemia

(see actual distribution)
• We see beyond the most
recent event (Lunch)
• We can measure quantitative
strength of effect in mg/dL:

E[ glu | R] - E[ glu | ¬R]
Looking forward: personalized
and pervasive sensing
• Can we develop a fitbit for nutrition?
• Find effect of food on glucose, causes of
overeating
Key open problems
• Data with uncertain timings
• Finding the “right” variables
• Explanation beyond models
Thanks!
Teams @CUMC, Stevens
NSF, NIH, JSMF
And my group is hiring!

Más contenido relacionado

Similar a Samantha Kleinberg, Assistant Professor of Computer Science, Stevens Institute of Technology at MLconf NYC - 4/15/16

Smoking and Pregnancy SurveyPlease take this brief survey of w.docx
Smoking and Pregnancy SurveyPlease take this brief survey of w.docxSmoking and Pregnancy SurveyPlease take this brief survey of w.docx
Smoking and Pregnancy SurveyPlease take this brief survey of w.docxpbilly1
 
ADAPT: Analysis of Dynamic Adaptations in Parameter Trajectories
ADAPT: Analysis of Dynamic Adaptations in Parameter Trajectories ADAPT: Analysis of Dynamic Adaptations in Parameter Trajectories
ADAPT: Analysis of Dynamic Adaptations in Parameter Trajectories Natal van Riel
 
BPM19 - trace clustering on very large event data
BPM19 - trace clustering on very large event dataBPM19 - trace clustering on very large event data
BPM19 - trace clustering on very large event dataXixi Lu
 
Chapter 3 Experimental Errors Statistics
Chapter 3 Experimental Errors  StatisticsChapter 3 Experimental Errors  Statistics
Chapter 3 Experimental Errors Statisticswanafif8
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data ScienceSean Byrnes
 
A fast Algorithm for Automatic Segmentation of Pancreas Histological Images f...
A fast Algorithm for Automatic Segmentation of Pancreas Histological Images f...A fast Algorithm for Automatic Segmentation of Pancreas Histological Images f...
A fast Algorithm for Automatic Segmentation of Pancreas Histological Images f...Tathagata Bandyopadhyay
 
An introduction to the stepped wedge cluster randomised trial
An introduction to the stepped wedge cluster randomised trial An introduction to the stepped wedge cluster randomised trial
An introduction to the stepped wedge cluster randomised trial Karla hemming
 
Imputation techniques for missing data in clinical trials
Imputation techniques for missing data in clinical trialsImputation techniques for missing data in clinical trials
Imputation techniques for missing data in clinical trialsNitin George
 
To infinity and beyond
To infinity and beyond To infinity and beyond
To infinity and beyond Stephen Senn
 
Robots, Small Molecules & R
Robots, Small Molecules & RRobots, Small Molecules & R
Robots, Small Molecules & RRajarshi Guha
 
Optimized Design of Broadly Detecting qPCR Primers and Probes Using a Conserv...
Optimized Design of Broadly Detecting qPCR Primers and Probes Using a Conserv...Optimized Design of Broadly Detecting qPCR Primers and Probes Using a Conserv...
Optimized Design of Broadly Detecting qPCR Primers and Probes Using a Conserv...Kate Barlow
 
The Analytics Opportunity in Healthcare
The Analytics Opportunity in HealthcareThe Analytics Opportunity in Healthcare
The Analytics Opportunity in HealthcareDATA360US
 
A Statistical Approach to Optimize Parameters for Electrodeposition of Indium...
A Statistical Approach to Optimize Parameters for Electrodeposition of Indium...A Statistical Approach to Optimize Parameters for Electrodeposition of Indium...
A Statistical Approach to Optimize Parameters for Electrodeposition of Indium...Arkansas State University
 
Prediction of Arterial Blood Gases(ABG) by Using Neural Network In Trauma Pat...
Prediction of Arterial Blood Gases(ABG) by Using Neural Network In Trauma Pat...Prediction of Arterial Blood Gases(ABG) by Using Neural Network In Trauma Pat...
Prediction of Arterial Blood Gases(ABG) by Using Neural Network In Trauma Pat...Mohammad Sabouri
 
Evaluation of Variability and Combinability of Fecal Calprotectin (FCP) Resul...
Evaluation of Variability and Combinability of Fecal Calprotectin (FCP) Resul...Evaluation of Variability and Combinability of Fecal Calprotectin (FCP) Resul...
Evaluation of Variability and Combinability of Fecal Calprotectin (FCP) Resul...Covance
 

Similar a Samantha Kleinberg, Assistant Professor of Computer Science, Stevens Institute of Technology at MLconf NYC - 4/15/16 (20)

Smoking and Pregnancy SurveyPlease take this brief survey of w.docx
Smoking and Pregnancy SurveyPlease take this brief survey of w.docxSmoking and Pregnancy SurveyPlease take this brief survey of w.docx
Smoking and Pregnancy SurveyPlease take this brief survey of w.docx
 
Chang Sha, China
Chang Sha, ChinaChang Sha, China
Chang Sha, China
 
ADAPT: Analysis of Dynamic Adaptations in Parameter Trajectories
ADAPT: Analysis of Dynamic Adaptations in Parameter Trajectories ADAPT: Analysis of Dynamic Adaptations in Parameter Trajectories
ADAPT: Analysis of Dynamic Adaptations in Parameter Trajectories
 
BPM19 - trace clustering on very large event data
BPM19 - trace clustering on very large event dataBPM19 - trace clustering on very large event data
BPM19 - trace clustering on very large event data
 
Statistics
StatisticsStatistics
Statistics
 
Chapter 3 Experimental Errors Statistics
Chapter 3 Experimental Errors  StatisticsChapter 3 Experimental Errors  Statistics
Chapter 3 Experimental Errors Statistics
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
A fast Algorithm for Automatic Segmentation of Pancreas Histological Images f...
A fast Algorithm for Automatic Segmentation of Pancreas Histological Images f...A fast Algorithm for Automatic Segmentation of Pancreas Histological Images f...
A fast Algorithm for Automatic Segmentation of Pancreas Histological Images f...
 
An introduction to the stepped wedge cluster randomised trial
An introduction to the stepped wedge cluster randomised trial An introduction to the stepped wedge cluster randomised trial
An introduction to the stepped wedge cluster randomised trial
 
Imputation techniques for missing data in clinical trials
Imputation techniques for missing data in clinical trialsImputation techniques for missing data in clinical trials
Imputation techniques for missing data in clinical trials
 
To infinity and beyond
To infinity and beyond To infinity and beyond
To infinity and beyond
 
Robots, Small Molecules & R
Robots, Small Molecules & RRobots, Small Molecules & R
Robots, Small Molecules & R
 
Sampling techniques
Sampling techniquesSampling techniques
Sampling techniques
 
Optimized Design of Broadly Detecting qPCR Primers and Probes Using a Conserv...
Optimized Design of Broadly Detecting qPCR Primers and Probes Using a Conserv...Optimized Design of Broadly Detecting qPCR Primers and Probes Using a Conserv...
Optimized Design of Broadly Detecting qPCR Primers and Probes Using a Conserv...
 
The Analytics Opportunity in Healthcare
The Analytics Opportunity in HealthcareThe Analytics Opportunity in Healthcare
The Analytics Opportunity in Healthcare
 
Tim Pletcher Presentation
Tim Pletcher PresentationTim Pletcher Presentation
Tim Pletcher Presentation
 
Tim Pletcher Presentation
Tim Pletcher PresentationTim Pletcher Presentation
Tim Pletcher Presentation
 
A Statistical Approach to Optimize Parameters for Electrodeposition of Indium...
A Statistical Approach to Optimize Parameters for Electrodeposition of Indium...A Statistical Approach to Optimize Parameters for Electrodeposition of Indium...
A Statistical Approach to Optimize Parameters for Electrodeposition of Indium...
 
Prediction of Arterial Blood Gases(ABG) by Using Neural Network In Trauma Pat...
Prediction of Arterial Blood Gases(ABG) by Using Neural Network In Trauma Pat...Prediction of Arterial Blood Gases(ABG) by Using Neural Network In Trauma Pat...
Prediction of Arterial Blood Gases(ABG) by Using Neural Network In Trauma Pat...
 
Evaluation of Variability and Combinability of Fecal Calprotectin (FCP) Resul...
Evaluation of Variability and Combinability of Fecal Calprotectin (FCP) Resul...Evaluation of Variability and Combinability of Fecal Calprotectin (FCP) Resul...
Evaluation of Variability and Combinability of Fecal Calprotectin (FCP) Resul...
 

Más de MLconf

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...MLconf
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingMLconf
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...MLconf
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushMLconf
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceMLconf
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...MLconf
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...MLconf
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMLconf
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionMLconf
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLMLconf
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksMLconf
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...MLconf
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldMLconf
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...MLconf
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...MLconf
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...MLconf
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeMLconf
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...MLconf
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareMLconf
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesMLconf
 

Más de MLconf (20)

Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
Jamila Smith-Loud - Understanding Human Impact: Social and Equity Assessments...
 
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language UnderstandingTed Willke - The Brain’s Guide to Dealing with Context in Language Understanding
Ted Willke - The Brain’s Guide to Dealing with Context in Language Understanding
 
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
Justin Armstrong - Applying Computer Vision to Reduce Contamination in the Re...
 
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold RushIgor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
Igor Markov - Quantum Computing: a Treasure Hunt, not a Gold Rush
 
Josh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious ExperienceJosh Wills - Data Labeling as Religious Experience
Josh Wills - Data Labeling as Religious Experience
 
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
Vinay Prabhu - Project GaitNet: Ushering in the ImageNet moment for human Gai...
 
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
Jekaterina Novikova - Machine Learning Methods in Detecting Alzheimer’s Disea...
 
Meghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the CheapMeghana Ravikumar - Optimized Image Classification on the Cheap
Meghana Ravikumar - Optimized Image Classification on the Cheap
 
Noam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data CollectionNoam Finkelstein - The Importance of Modeling Data Collection
Noam Finkelstein - The Importance of Modeling Data Collection
 
June Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of MLJune Andrews - The Uncanny Valley of ML
June Andrews - The Uncanny Valley of ML
 
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection TasksSneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
Sneha Rajana - Deep Learning Architectures for Semantic Relation Detection Tasks
 
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
Anoop Deoras - Building an Incrementally Trained, Local Taste Aware, Global D...
 
Vito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI WorldVito Ostuni - The Voice: New Challenges in a Zero UI World
Vito Ostuni - The Voice: New Challenges in a Zero UI World
 
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
Anna choromanska - Data-driven Challenges in AI: Scale, Information Selection...
 
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
Janani Kalyanam - Machine Learning to Detect Illegal Online Sales of Prescrip...
 
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
Esperanza Lopez Aguilera - Using a Bayesian Neural Network in the Detection o...
 
Neel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to codeNeel Sundaresan - Teaching a machine to code
Neel Sundaresan - Teaching a machine to code
 
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
Rishabh Mehrotra - Recommendations in a Marketplace: Personalizing Explainabl...
 
Soumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better SoftwareSoumith Chintala - Increasing the Impact of AI Through Better Software
Soumith Chintala - Increasing the Impact of AI Through Better Software
 
Roy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime ChangesRoy Lowrance - Predicting Bond Prices: Regime Changes
Roy Lowrance - Predicting Bond Prices: Regime Changes
 

Último

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Último (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Samantha Kleinberg, Assistant Professor of Computer Science, Stevens Institute of Technology at MLconf NYC - 4/15/16

  • 1. Causal Inference and Explanation to Improve Human Health Samantha Kleinberg Stevens Institute of Technology
  • 2. Can we learn.. risk factors for heart failure causes of secondary brain injury what leads to an individual’s hypoglycemic episodes from observing people?
  • 3. Why worry about causality? • Robust predictions • Coherent explanations • Actionable information
  • 4. Why observational data? • Experiments often infeasible, unethical, or too expensive • Routinely collected in many situations • Electronic health records in hospitals • ICU data streams • Body-worn sensors and mobile devices
  • 5. This talk • We can make some progress in getting causes from medical data • But should worry about missing data and nonstationarity • Explanation can be automated (sometimes)
  • 6. Logic-based causal inference • Complex, temporal relationships • Assess average difference cause makes to probability of effect Kleinberg, S. (2012) Causality, Probability, and Time. Cambridge University Press. Kleinberg, S. (2015) Why: A Guide to Finding and Using Causes. O’Reilly Media
  • 7. • Main idea: looking for better explanations for the effect • Inferring timing • Instead of accepting/rejecting hypotheses, refine them from data using greedy search • Can start by testing relationships between all variables and CHF in 1-2 weeks, and ultimately infer "high AST leads to CHF in 4-10 days”
  • 8. Application: understanding stroke recovery Massive amounts of data collected in ICU (>100,000 measurements per person) How do patients change over time?
  • 9. NICU dataset • 98 patients with subarachnoid hemorrhage • Monitoring included • Depth and surface EEG • Microdialysis • Physiologic measurements (no data on procedures)
  • 10. Lots of data, but lots of missing data • Device malfunctions • Device connected to perform a procedure • Different monitors started at different times • Different recording frequencies TW% BrT CI CO2EX CPP CVP ELWI GEDI GLU GLU2panel HR ICP 0.56001433 0.55845172 0.45439822 0.77027571 0.89076666 0.80927443 0.47893916 0.47893916 0.07274971 0.07943021 0.98646971 0.92582606 0.69029286 0.71585128 0.67012404 0.80326783 0.91217418 0.84294797 0.56494303 0.56494303 0.13405794 0.03836355 0.98382787 0.94543471 56.0014325 55.845172 45.4398223 77.0275708 89.0766656 80.9274429 47.893916 47.893916 7.27497093 7.94302149 98.6469713 92.5826062 69.0292858 71.5851282 67.0124038 80.3267833 91.2174176 84.294797 56.4943028 56.4943028 13.4057939 3.83635458 98.3827871 94.5434715 0D 20D 40D 60D 80D 100D TW%D BrTD CID CO2EXD CPPD CVPD ELWID GEDID GLUD GLU2panelD HRD ICPD LACD LGRD LPRD MAPD MVD PEEPD PGRD PYRD PbtO2D RRD SPO2%D SVVD SvO2D TMPD rCBFD %"data"recorded"per" variable" 0296hrsD 962168hrsD
  • 11. And… • All variables may be missing at once (if measured by single device) • Can’t use imputation methods that assume some present values • Missing values depend on variable (NMAR) + other variables (MAR) • e.g. BG depends on itself, as well as insulin • Variables are correlated across time
  • 12. Imputation w/lagged correlations S. A. Rahman, Y. Huang, J. Claassen, N. Heintzman, and S. Kleinberg. Combining Fourier and Lagged k-Nearest Neighbor Imputation for Biomedical Time Series Data. Journal of Biomedical Informatics (2015) https://github.com/kleinberg-lab/FLK-NN
  • 14. Regulation changes over time 0-96hrs 96-168 hrs Claassen J, Rahman SA, Huang Y, Frey H, Schmidt M, Albers D, Falo CM, Park S, Agarwal S, Connolly ES, Kleinberg S (2015) Causal structure of brain physiology after brain injury. PLoS ONE.
  • 15. Explanation • Why is an individual’s brain swelling at a particular time? • What caused a specific stock market crash? • Who is at fault for a car accident?
  • 16. • Inference finds type-level relationships
 
 
 
 • Explanation tells us why specific events happened
 
 
 
 heart failure uncontrolled diabetes thyroid disfunction heart failure uncontrolled diabetes thyroid disfunction
  • 17. • Goals for explanation • Find causes of specific events automatically (no human in the loop) • Find causes of when, whether and how events occur • Approach: simulation to answer counterfactual queries C. Merck and S. Kleinberg. Causal explanation under indeterminism: A sampling approach. AAAI, 2016. https://github.com/kleinberg-lab/stoch_cf
  • 18. Counterfactual vs. Actual Distributions • P(•|¬A) is the counterfactual distribution of A
 
 • P(•|A) is the actual distribution of A

  • 19. Three Types of Explanation B because of A iff
 P (B|A) >> P (B|¬A) B hastened by A iff
 E[tB |A] << E[tB |¬A] B intensified by A iff
 E[mB |A] >> E[mB |¬A] 
 B despite A iff
 P (B|A) << P (B|¬A) B delayed by A iff
 E[tB |A] >> E[tB |¬A] B attenuated by A iff
 E[mB |A] << E[mB |¬A] 
 tB = time of B occurring mB = intensity (manner) of B occurring probability:
 timing:
 intensity:
  • 21. Hastening • 6->8, then 8->P • but 7->8 is a more reliable backup • probability raising finds
 “8->P despite 6->8” • but by analyzing timing we find
 “6->8 hastened 8->P”
  • 22. Diabetes Simulation • Run or Lunch alone would not have caused Hypoglycemia
 (see counterfactual dists) • Yet together they explain the Hypoglycemia
 (see actual distribution) • We see beyond the most recent event (Lunch) • We can measure quantitative strength of effect in mg/dL:
 E[ glu | R] - E[ glu | ¬R]
  • 23. Looking forward: personalized and pervasive sensing • Can we develop a fitbit for nutrition? • Find effect of food on glucose, causes of overeating
  • 24. Key open problems • Data with uncertain timings • Finding the “right” variables • Explanation beyond models
  • 25. Thanks! Teams @CUMC, Stevens NSF, NIH, JSMF And my group is hiring!