#DEATHVSDATASCIENCE
Explainable Models for
Healthcare AI
Muhammad Aurangzeb Ahmad, Carly Eckert, Ankur Teredesai, Vikas Kumar
KenSci Inc.
https://mlhealthcare.github.io
Explainable Models in AI
Well, the computer
assures me that you
are fine and it has
given me 100
reasons for it.
Doctor, are you
sure that I am ok
now? My head
still hurts.
Learning Objectives of the Tutorial
• Why?
• Why AI and ML need to provide explanations in healthcare
• What?
• What types of explanations we need in healthcare
• How?
• How to select the right machine learning algorithms when explanations are needed
• Where?
• Settings and different types of interpretable machine learning models
• When?
• The past, the present, and the future of explainable AI in healthcare
Explainable By Any other Name.. is still explainable
• Explain: Make (an idea or situation)
clear to someone by describing it in more
detail or revealing relevant facts.
• Interpret: Explain the meaning of
(information or actions)
• Understand: Perceive the intended
meaning of (words, a language, or a
speaker)
• Comprehend: Grasp mentally;
understand
• Intelligible: Able to be understood;
comprehensible
[Dictionary, Oxford English 2018]
A (Brief) History of the Need for Explanations
• Foundations of Logic: Clear and explicit reasoning, explainable to
almost anyone
• Bayes: Grounding Probability and inference on solid foundations
(18th century) [Bayes 1763]
• Goal of (early) AI: Mimic human reasoning mechanically
• Early Expert Systems in healthcare like MYCIN were explainable
systems
• A push around explainability arose when ensemble methods first came
about in the 1980s
• Current interest in explainability is because of widespread adoption
and deployment of machine learning systems
Explainable AI / Interpretable Machine Learning
• Explainable AI or interpretable
machine learning refers to giving
explanations of AI/machine learning
models to humans with domain
knowledge
[Craik 1967, Doshi-Velez 2014]
• Explanation: Why is the prediction
being made?
• Explanation to Human: The
explanation should be
comprehensible to humans in (i)
natural language (ii) easy to
understand representations
• Domain Knowledge: What counts as an
understandable explanation depends on the
audience's domain knowledge (e.g., the Standard
Model Lagrangian explains particle physics, but
only to a physicist)
Explainable AI is More Than Models
• Each constituent element of the solution process (features, algorithm, model parameters, model)
needs to be explainable for the solution to be truly explainable [Lipton 2016]
• The right explanation granularity for the user of an AI solution depends on the user's cognitive
capacity and domain knowledge
What Explanations Are Not?
• Explanation vs. Justification
• Explaining is giving reasons for the prediction in
terms of the model and the prediction [Biran 2014,
Biran 2017A, Biran 2017B]
• Justification is putting the explanation in a
context in a way that makes sense
• Justification does not have to correspond to how
the model actually works
• Explanation vs. Causality
• Explanations are (mostly) not causal
• Predictive models are not causal
• Predictive models are not prescriptive
• Example: End of life prediction for heart failure
patients
Audience Poll
How many people in the audience?
• Are Physicians/MDs
• Work in the healthcare domain
• Have built a machine learning model
• Work with healthcare data
• Plan to work with applied AI/ML in
healthcare in the near future
Healthcare AI
How Are Decisions Made in Healthcare?
1. Heuristics: 7 ± 2 factors, ~80% of care decisions
2. Rules-based systems: tens of factors, ~18% of care decisions
3. ML-based systems: hundreds of factors, ~2% of care decisions
Why Do We Need Explanations in AI & Machine Learning?
• Machine learning is everywhere!
Why Do We Need Explanations in Healthcare AI Now?
• More implications (known and unknown)
Why Do We Need Explanations in Healthcare
AI?
When fairness is critical:
• Any context where humans are required to provide
explanations so that people cannot hide behind machine
learning models [Al-Shedivat 2017B, Doshi-Velez 2014]
When consequences are far-reaching:
• Predictions can have far reaching consequences e.g.,
recommend an operation, recommend sending a patient to
hospice etc.
When the cost of a mistake is high:
• Ex: misclassification of a malignant tumor can be costly and
dangerous
When a new/unknown hypothesis is drawn:
• “It's not a human move. I've never seen a human play this
move.” [Fan Hui]
• Pneumonia patients with asthma had lower risk of dying
When Do We Need Explanations in Healthcare
AI?
• When compliance is required:
• GDPR
• Right to explanation
• Predictive performance is not
enough
[Doshi-Velez 2017]
Xkcd comics
Why Do We Need Explanations Now?
• [Morsey 2018] [Hernandez 2018, Ross 2018]
• "IBM has a Watson Dilemma" [Ram 2018]
Why Do We Need Explanations Now?
• Risk Prediction with Black-box Models [Caruana 2015, Caruana 2017]
• Misdiagnosis via Adversarial Attacks [Finlayson 2018]
Machine learning methods in use: supervised learning, Bayesian models, neural nets, ensemble models,
Markov models, statistical models, graphical models, reinforcement learning, natural language processing,
expert systems, and recommendation systems [Chang 2009]
How Machine Learning Can Improve Healthcare
Across the Continuum of Care?
[Figure: healthcare utilization plotted against age (0 to 80 years), with life events such as birth, appendicitis,
colds and minor injuries, birth of two children, diabetes, CHF, and death. Each stage maps to ML use cases and
operational goals: no-show prediction and variation analysis to optimize clinic operations and benefit to the
patient; LOS prediction, risk-of-readmission prediction, and an ED flow suite to optimize inpatient flow and
hospital efficiency; SSI prediction to prevent post-operative complications; care-management recommender
systems, disease progression and complication prediction, and Rx noncompliance prediction for chronic disease
management (DM, CHF); EF progression prediction and preventive care to prevent sequelae of disease; and
end-of-life prediction to help patients plan for end-of-life needs.]
Problems in Patient Flow
Operationalizing AI in Healthcare
Only a small fraction of a real-world machine learning system actually consists of machine
learning code [Sculley 2015].
Data in Operationalized AI in Healthcare
• Syntactic Correctness
• Is the data in the correct format e.g., if the AI data pipeline requires sex for males as
‘m’ and the input data encodes it as ‘1’
• Morphological Correctness
• Is the data within the range of possible values e.g., a blood pressure of 500 does not
make sense
• Semantic Correctness
• Do the variables actually correspond to the semantics being ascribed to them? E.g., a
variable which encodes blood pressure as high vs. low will have different semantics for
children as compared to adults (a minimal sketch of all three checks follows below)
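To make the three checks concrete, here is a minimal, hypothetical sketch in Python; the column names, encodings, and plausibility ranges are illustrative assumptions, not part of any real pipeline.

```python
# Minimal sketch of the three correctness checks described above, assuming
# a hypothetical pandas DataFrame with 'sex' and 'systolic_bp' columns.
import pandas as pd

df = pd.DataFrame({
    "age": [7, 45, 62],
    "sex": ["m", "1", "f"],          # '1' violates the expected encoding
    "systolic_bp": [102, 500, 128],  # 500 mmHg is not physiologically plausible
})

# Syntactic correctness: is the encoding the one the pipeline expects?
syntactic_ok = df["sex"].isin(["m", "f"])

# Morphological correctness: is the value inside a plausible range?
morphological_ok = df["systolic_bp"].between(40, 300)

# Semantic correctness: the same label can mean different things for
# different cohorts, e.g. "high" blood pressure in children vs. adults.
df["bp_category"] = df.apply(
    lambda r: "high" if r["systolic_bp"] > (110 if r["age"] < 13 else 140) else "normal",
    axis=1,
)

print(df[~syntactic_ok | ~morphological_ok])  # rows that need review
```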
Audience Quiz: Healthcare AI
Explanations are needed in healthcare AI to:
A. Foster trust in AI models in healthcare
B. Reduce the risk of ill-informed decision making
C. Comply with regulations like GDPR
D. All of the above
Pillars of Explainable AI
7 Pillars of Explainable AI in Healthcare
Transparency | Domain Sense | Consistency | Parsimony | Fidelity | Trust/Performance | Generalizability
7 Pillars of Explainable AI in Patient Flow
transparency | domain sense | consistency | parsimony | generalizability | trust | fidelity
Transparency | Admission Prediction
Transparency
Ability of the machine learning algorithm,
model, and the features to be understand-
able by the user of the system.
Admission Prediction
What is the likelihood of the patient being admitted
to the hospital from the emergency department?
Katherine is a data scientist
working at an ML company,
she specializes in explainable
ML.
She is generally healthy but
was recently seen by her PCP
for gallstones.
She now has RUQ abd pain
and a fever and is at County
Emergency Department for
evaluation and treatment.
Transparency
• The ML model for predicting Katherine’s likelihood of
admission gives her a high likelihood (0.62)
• Katherine’s physician has noted her age, health history,
and vital signs and is surprised by this elevated risk
score
• The physician knows that the risk model is a deep
learning model, so he cannot understand how it is
working
• But he can examine the top factors associated with
the prediction
Transparent to Whom?
• Transparency may mean different things to different people
• Understanding model outputs
• Understanding algorithms
• Understanding the algorithm may not be sufficient
Transparency | Simulatability
• The whole model must be understandable at once
• One must be able to look at the model and
understand the whole model without excessive
mental effort [Lipton 2016]
• Example: While both decision trees are
explainable, Decision Tree B has the property of
simulatability but Decision Tree A does not
Decision Tree A
Decision Tree B
Transparency | Decomposability
• Each component should also admit an easy/intuitive explanation
• A linear model with highly engineered features vs. a linear model with simple
features [Lipton 2016]
• Example: Model A is decomposable but Model B is not (see the sketch below)
Regression Variables
• Age
• Gender
• Race
• Diabetic
• Smoker
Target Variable
• Length of Stay
Regression Model A
Regression Variables
• Highly engineered, opaque features (e.g., W, F, Ww)
Target Variable
• Length of Stay
Regression Model B
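A minimal sketch of what a decomposable, "Model A"-style regression looks like in code, assuming synthetic data and the feature names listed above; each named coefficient is directly readable as an explanation.

```python
# A minimal sketch of a decomposable model in the spirit of "Regression
# Model A": each coefficient attaches to a simple, human-readable feature,
# so every component admits an intuitive explanation. Data are synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
feature_names = ["age", "gender", "race", "diabetic", "smoker"]
X = rng.random((200, len(feature_names)))
y = 3.0 * X[:, 0] + 1.5 * X[:, 3] + rng.normal(0, 0.1, 200)  # length of stay

model = LinearRegression().fit(X, y)
for name, coef in zip(feature_names, model.coef_):
    print(f"{name:>10}: {coef:+.2f} days per unit change")
```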
Transparency | Algorithmic
• Guarantee that a model will converge [Lipton 2016, Hill 2018]
• Models like Regression Models, SVM etc. have this property
• Deep Learning does not have this property
Transparency | Feedback & Scrutability
• Feedback transparency refers to how changes in the model will affect the model's
predictions [Herlocker 2000]
• How do multi-objective optimization models affect each other, e.g., optimizing
reduction of risk of readmission and reduction of length of stay
• Not an absolute requirement; a nice-to-have property
• Especially applicable to scrutable machine learning systems
Transparency | Examples
Transparent
• Falling Rule Lists
• GAM (Generalized Additive Models)
• GA2M (Generalized Additive Models with
interactions)
• LIME (Locally Interpretable Model Agnostic
Explanations)
• Naïve Bayes
• Regression Models
• Shapley Values
Semi-Transparent
• Shallow Ensembles
Non-Transparent
• Deep Learning
• SVM (Support Vector Machines)
• Gradient Boosting Models
Transparency | Locally Interpretable Model Explanations
• Locally Interpretable Model-agnostic Explanations (LIME) [Ribeiro 2016a]
• Given a non-interpretable model, create local interpretable
models that are good approximations of the model's behavior
at the local level
• The local model can be a linear model, such as a regression
model
• LIME simulates the distribution of the data locally
• Can be problematic if the decision boundary is not locally
linear (a minimal sketch of the idea follows below)
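A minimal from-scratch sketch of the LIME idea (not the lime library itself), assuming a synthetic dataset and a gradient-boosting model as a stand-in for the black box: perturb around the instance, weight the perturbations by proximity, and fit a weighted linear surrogate.

```python
# A minimal sketch of local surrogate modeling in the spirit of LIME.
# The black-box model and data here are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = (X[:, 0] * X[:, 1] > 0).astype(int)          # non-linear ground truth
blackbox = GradientBoostingClassifier().fit(X, y)

x0 = X[0]                                        # instance to explain
Z = x0 + rng.normal(scale=0.5, size=(1000, 4))   # local perturbations
p = blackbox.predict_proba(Z)[:, 1]              # black-box predictions
w = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.5) # proximity kernel

surrogate = Ridge(alpha=1.0).fit(Z, p, sample_weight=w)
print("local feature weights:", surrogate.coef_)
```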
Transparency | Locally Interpretable Model Explanations
[Ribeiro 2016a,
Ribeiro 2016b]
Computer Vision
Example: Object
Detection
Healthcare Example: Chronic Kidney
Disease
Transparency | Shapley Values
• Game Theoretic Method for determining feature contributions [Strumbelj 2014, Lipovetsky
2001, Lundberg 2017]
• Each feature is a ‘player’ in a game where the prediction is the payout
• The Shapley value - a method to fairly distribute the ‘payout’ among the features.
Transparency | Shapley Values
• Example: predicted length of stay for different coalitions of features (5.6, 6.1, 4.9, 6.9, 5.8, 6.2, 5.7, and
5.3 days); the Shapley value averages each feature's marginal contribution across these coalitions (see the
sketch below)
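A minimal sketch of the exact Shapley computation behind the figure: each feature's value is its marginal contribution averaged over all orderings. The coalition values reuse the day figures above, but the assignment of coalitions to the three hypothetical features is illustrative.

```python
# Exact Shapley values over a small feature set: average each feature's
# marginal effect on predicted length of stay over all feature orderings.
# The coalition-to-value mapping below is illustrative, not from a real model.
from itertools import permutations

features = ["age", "chronic_condition", "prior_admission"]
# value(S): predicted LOS (days) when only the features in S are "known"
value = {
    frozenset(): 5.6,
    frozenset({"age"}): 6.1,
    frozenset({"chronic_condition"}): 4.9,
    frozenset({"prior_admission"}): 6.9,
    frozenset({"age", "chronic_condition"}): 5.8,
    frozenset({"age", "prior_admission"}): 6.2,
    frozenset({"chronic_condition", "prior_admission"}): 5.7,
    frozenset(features): 5.3,
}

shapley = {f: 0.0 for f in features}
orderings = list(permutations(features))
for order in orderings:
    seen = set()
    for f in order:
        shapley[f] += value[frozenset(seen | {f})] - value[frozenset(seen)]
        seen.add(f)
for f in features:
    shapley[f] /= len(orderings)

print(shapley)  # contributions sum to value(all) - value(empty) = -0.3 days
```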
Transparency in Deep Learning
• Deep learning models are not transparent
[Bach 2015]
• Features extracted by deep learning models
give different levels of transparency
• Many model-agnostic methods exist to
extract explanations from deep learning
models
• Mimic models are also widely used for
creating interpretable models
Domain Sense | ED Census Prediction
ED Census Prediction
Predict the number of patients in the emergency
department (ED) at a given time
Domain Sense
The explanation should make sense in the domain of
application and to the user of the system
ED
Domain Sense
• Katherine and her physician, Dr. Monroe, discuss the problem of ED census
prediction
• The top features may be temporal as well as those related to the inpatient facility
(how many available beds) -- explanations are reconfigured to surface only
factors that are modifiable
• Wendy, a nurse, sees the new dashboard but the modifiable factors are not really
helpful to her. The explanation dashboard is then customized for Wendy.
• In the aggregate the top reason for high ED census is related to the lack of
available inpatient beds and not the influx of ED patients
Domain Sense | Explanations are Role-Based
• A physician requires different explanations as compared to a staffing planner
• Explanations need to be in the right language and also in the right context [Doshi-
Velez 2014, Druzdzel 1996]
• Making domain sense may require sacrificing or deemphasizing other
requirements for explanations like fidelity
• If actionability is required then the factors for explanations should reflect that
Domain Sense | Taxonomy of Explainable Factors
Domain Sense | Interpreting Risk Score
• Interpreting output from machine learning models may
also have an element of subjectivity
• Example: predicting risk of readmission
• If a patient has a risk score of 0.62, what does that mean?
• Some domain knowledge may be needed to map it to a
manageable risk
• Underlying distributions may also change, and thus the
interpretation may also change, e.g., a CHF cohort
Domain Sense | Actionability of Explanations in
Healthcare
• Actionability is context dependent and role
dependent
• Actionability is NOT causality
• There may be circumstances where actionability is
not possible
• Temporality associated with modifiable factors: flu
vaccine is a 1 minute task vs. weight loss
programs which can take months
• Actionability does not presume causality but it can
be used to devise interventions or to investigate
novel associations
If you take this
medicine then
you will be
fine in 4 days.
So if I take 4
times the
dosage then I
will be fine by
tonight!
Domain Sense | Narrative Roles
• Normal evidence is evidence that is expected to be present in many instances predicted to be in the class.
• With high importance, high effect is not surprising.
• Exceptional evidence is evidence that is not usually expected to be present.
• Contrarian evidence is strongly unexpected, since the effect has the opposite sign than expected.
[Biran 2017A]
Domain Sense | GAM (Generalized Additive Models)
• GAMs combine single-feature models
called shape functions through a linear
function [Lou 2012]
• Shape functions can be arbitrarily
complex
• GAMs have more predictive power than
linear models
Domain Sense | GA2M
• Generalized Additive Models with pairwise interactions (GA2M) [Caruana 2015, Caruana 2017]
• Accounts for pairwise interactions between features in addition to the main effects
• Possible to visualize the relationship between each feature (or pair of features) and the target
variable to gain insights into why the prediction is being made
Domain Sense | GA2M
• There can be a massive number of
pairwise interactions between a large
number of features, e.g., n = 1000
• GA2M reduces the number of
pairwise interactions that need to be
considered (a minimal sketch follows below)
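A minimal sketch using the Explainable Boosting Machine from the open-source interpret package, a GA2M-style implementation in the spirit of [Caruana 2015]; the synthetic data and the availability of this particular API are assumptions of the sketch.

```python
# A minimal sketch of a GA2M-style model via the Explainable Boosting
# Machine; data are synthetic and the package/API is an assumption.
import numpy as np
from interpret.glassbox import ExplainableBoostingRegressor

rng = np.random.default_rng(0)
X = rng.random((500, 4))                       # e.g. age, bmi, hr, sodium
y = X[:, 0] * 5 + X[:, 1] * X[:, 2] * 3        # main effect + one interaction

# `interactions=5` caps the number of pairwise terms the model considers,
# which is how GA2M keeps the n*(n-1)/2 candidate interactions tractable.
ebm = ExplainableBoostingRegressor(interactions=5)
ebm.fit(X, y)

global_expl = ebm.explain_global()             # per-term shape functions
print(global_expl.data(0))                     # shape function for term 0
```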
Audience Quiz: Domain Sense
All Explanations in Healthcare AI are:
A. Actionable
B. Causal
C. Role Based
D. All of the above
Consistency | LWBS
Consistency
The explanation should be
consistent across different models
and across different runs of the
model
LWBS
Left without being seen refers to a
patient leaving the facility without
being seen by a physician
Factors at ~18:00: prior LWBS, insurance status, past history
Factors at ~22:00: temperature, insurance status, chief complaint, age
[Figure: two snapshots of the ED tracking board showing the same patient, Katherine Collins (37, abdominal pain,
Dr. Monroe, private insurance, arrived 16:52), alongside another patient, Susan Kennedy (72, dyspnea, Dr. Lewis,
Medicare, arrived 19:45), with the LWBS risk scores and explanations changing between snapshots]
Consistency
• Dr. Monroe examines the reasons for a patient who is predicted to leave without being seen and notices that
the explanation for them leaving is different from what he observed 4 hours ago
• Upon investigation, Katherine determines that a LIME model, which is non-deterministic, is being used to
generate the explanations, hence the differences
• Having different explanations for the same instance can be confusing for users, hence the need for
consistency
Consistency | Inconsistent Ranking
• Patient: Harrison Welles: age 72, COPD & chronic liver disease
• Use Case: Length of Stay Prediction
Population Ranking | LIME | Shapley Values
Min Arterial Pressure past 12 months | Min Arterial Pressure past 12 months | Min Arterial Pressure past 12 months
HR in ED <60 or >110 | Max ALT past 12 months | Max ALT past 12 months
Chronic Condition Dementia | Min Sodium past 12 months | Transfer Patient
Temperature in ED <34 or >38 | Max Neutrophils past 12 months | Min Sodium past 12 months
Chronic Condition Hyperlipidemia | Median PaO2:FiO2 past 12 months | Max AST past 12 months
Chronic Condition Anemia | Most Recent Glucose | Max Neutrophils past 12 months
Weekend Admission | Transfer Patient | Mean Temperature past 12 months
Chronic Condition Diabetes | Chronic Condition Acute Myocardial Infarction | Most Recent Glucose
Chronic Condition Ischemic Heart Disease | Chronic Condition Heart Failure | Std dev PaO2:FiO2 past 12 months
Chronic Condition Acute Myocardial Infarction | Chronic Condition Diabetes | Median HbA1c past 12 months
Consistency | Model Multiplicity vs. Explanation Consistency
• Given the same dataset, multiple machine learning
models can be constructed with similar
performance
• The explanations produced by multiple
explainable algorithms should be very similar, if not the
same
• Wide divergence in explanations is a sign of a problem
with the explanations or with the algorithm(s)
Variable Model A Model B Model C
Age 1 1 5
Gender 2 4 6
Diabetic 3 5 1
Race 4 6 4
Smoker 5 2 3
Alcoholic 6 3 2
Variables for Length of Stay Prediction Ranked
Consistency | Evaluating Consistency: Human Evaluation
• Humans can evaluate quality of explanations across models
• Does not scale well
• Have humans provide explanations
Consistency | Evaluating Consistency: Machine Evaluation
• Kendall’s Tau [Kendall 1938]
• Measure of rank correlation i.e., the similarity of the orderings of the data when ranked by each of the
quantities
• Values range from 1 to -1
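A minimal sketch of comparing two explanation rankings with Kendall's tau via scipy; the feature names and ranks are illustrative (they mirror the length-of-stay ranking table above).

```python
# Comparing two explanation rankings of the same features with Kendall's tau.
from scipy.stats import kendalltau

features = ["age", "gender", "diabetic", "race", "smoker", "alcoholic"]
rank_model_a = [1, 2, 3, 4, 5, 6]   # ranks assigned by explanation method A
rank_model_b = [1, 4, 5, 6, 2, 3]   # ranks assigned by explanation method B

tau, p_value = kendalltau(rank_model_a, rank_model_b)
print(f"Kendall's tau = {tau:.2f} (1 = identical ordering, -1 = reversed)")
```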
Consistency | Evaluating Consistency: Machine Evaluation
• Kendall's W [Kendall 1939]: W = 12S / (m^2 (n^3 - n)), where m explanation methods each rank the same n features
• S is the sum of squared deviations of the total ranks from their mean: S = sum over i of (R_i - R_mean)^2
• The total rank of feature i and the mean total rank are R_i = sum over j of r_ij and R_mean = m(n + 1)/2
(a minimal sketch follows below)
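A minimal sketch of Kendall's W for agreement among several explanation methods, computed directly from the formula above; the rank matrix is illustrative.

```python
# Kendall's W for agreement among m explanation methods ranking n features.
import numpy as np

# rows = explanation methods (m raters), columns = features (n items)
ranks = np.array([
    [1, 2, 3, 4, 5, 6],   # Model A
    [1, 4, 5, 6, 2, 3],   # Model B
    [5, 6, 1, 4, 3, 2],   # Model C
])
m, n = ranks.shape

R = ranks.sum(axis=0)               # total rank of each feature
R_mean = m * (n + 1) / 2            # mean total rank
S = ((R - R_mean) ** 2).sum()       # sum of squared deviations
W = 12 * S / (m ** 2 * (n ** 3 - n))

print(f"Kendall's W = {W:.2f} (1 = complete agreement, 0 = no agreement)")
```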
Consistency | Contextual Explanation Networks
• CENs are deep networks that generate parameters for context-specific probabilistic graphical models, which
are then used for prediction and play the role of explanations [Al-Shedivat 2017A]
• Prediction and explanation are a joint task
• Local approximations align with the local decision boundaries
Counterexamples | ED Arrivals Prediction
• Wendy and Katherine are looking at ED arrivals prediction
• Katherine shows the explanation factors to Wendy but the performance of the corresponding model is
relatively low (Precision = 0.34, Recall = 0.46)
• Wendy states that getting explanations for ED arrivals prediction is not very important, performance takes
precedence
Exceptions to Explanations | Near Perfect Performance
• Models which have perfect, near-perfect, or
(always) better-than-human performance
• Detection of Diabetic Retinopathy with near
perfect precision and recall
• Better than human detection of tumors in
radiology
• Pneumonia detection as good as humans
but with scalability
Gulshan 2016
Rajpurkar 2017, Mazurowski 2018
Exceptions to Explanations | Theoretical Guarantees
• PAC (Probably Approximately Correct) Learning
• SVMs for Linearly Sparse data
• Verification system for demonstrating that the output class
of certain neural network is constant across a desired
neighborhood [Pulina 2010]
• Verification Systems for neural networks with certain
assumptions e.g., for example, that only a subset of the
hidden units in the neural network are relevant to each
input
I have given up
convergence
guarantees for lent.
The Machine Learning Hipster
Limits of Explanations
“You can ask a human, but, you know, what cognitive psychologists have discovered is that when you ask a
human you’re not really getting at the decision process. They make a decision first, and then you ask, and then
they generate an explanation and that may not be the true explanation.”
- Peter Norvig
Explainability and Cognitive Limitations
• Machine Learning is used in problems where the size of the data and/or the number of variables is too
large for humans to analyze
• What if the most parsimonious model is indeed too complex for humans to analyze or comprehend?
• Ante-Hoc explanations may be impossible and post-hoc explanations would be incorrect
• Explanations may not be possible in some cases
Audience Quiz: Exceptions
What is a scenario when you may not want
explanations:
A. Theoretical performance guarantees
B. Better than human prediction performance
C. Whenever I want
D. 0.7 AUC
E. Both A & B
F. Both C & D
#DEATHVSDATASCIENCE
Break
So far we have covered:
• Motivation for Explanation in ML in
healthcare
• Transparency
• Domain Sense
• Consistency
• Counter examples to explanation
After the break:
• Parsimony
• Generalizability
• Trust
• Fidelity
• Discussion
Parsimony | Admission Disposition
Parsimony
• The explanation should be as simple as possible
• Applies to both the complexity of the explanation and
the number of features provided to explain
Admission Disposition
• Where in the hospital the patient should go once
they are admitted
• elevated standard deviation of albumin
• elevated heart rate variability
• low potassium
• min HR in ED
• max BMI
Patient Risk Profiles
Admissions Disposition
Sandra Robinson:
54% Observation Unit
Diagnostic Parsimony
• Healthcare in practice
• Multiple competing hypotheses
• Hypotheses are constantly revised
• "Most commons" are termed so for a reason
• Occam's Razor
• Derive a unifying diagnosis that can explain all of a patient's symptoms
• Hickam's Dictum
• A man can have as many diseases as he damn well pleases
Parsimony
• MDL (Minimum Description Length) and
Occam’s Razor
• Occam’s Razor in Machine Learning
[Domingos 1999]
• Occam’s First Razor
• Occam’s Second Razor
• Occam’s Razor in Explainable AI Learning
• The simplest explanation is not always the
best explanation
Chatton’sAnti-Razor
• “If three things are not enough to verify an affirmative
proposition about things, a fourth must be added,
and so on.”
• The Heliocentric models were not better predictors at
the time when they were first proposed
• The Flat Earth is a simpler hypothesis than the round
Earth model and is in concordance with a large
number of observations
Decision Lists | Falling Rule Lists
• Order of rules determines order of classification
• Probability of success decreases monotonically down the list
• Example: patients would be stratified into risk sets and the highest at-risk patients should
be considered first
• Greedy tree-based models and Bayesian models for list construction
[Wang and Rudin, 2015] (a minimal sketch of applying a falling rule list follows below)
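A minimal sketch of how a learned falling rule list is applied at prediction time; the rules, thresholds, and probabilities are invented for illustration rather than produced by the [Wang and Rudin, 2015] learning procedure.

```python
# A falling rule list at prediction time: an ordered list of if-then rules
# whose risk estimates decrease monotonically, so the highest-risk strata
# are checked first. Rules and probabilities here are illustrative.
falling_rule_list = [
    (lambda p: p["prior_admissions"] >= 2 and p["chf"], 0.71),
    (lambda p: p["age"] > 75 and p["copd"], 0.52),
    (lambda p: p["prior_admissions"] >= 1, 0.33),
    (lambda p: True, 0.08),  # default rule for everyone else
]

def predict_risk(patient):
    # Return the risk of the first matching rule (and its position),
    # which is also the explanation for the prediction.
    for position, (condition, risk) in enumerate(falling_rule_list):
        if condition(patient):
            return risk, position

patient = {"age": 80, "copd": True, "chf": False, "prior_admissions": 1}
print(predict_risk(patient))  # -> (0.52, 1): matched the second rule
```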
Bayesian Rule Lists
• BRLs are decision lists--a series of if-then
statements
• BRLs discretize a high-dimensional,
multivariate feature space into a series of
simple, readily interpretable decision
statements.
• Experiments show that BRLs have predictive
accuracy on par with the current top ML
algorithms (approx. 85- 90% as effective) but
with models that are much more interpretable [Letham 2015]
Decision Lists | Interpretable Decision Sets
• Sets of independent if-then rules (unlike decision lists, the rules are not ordered)
• Allows scrutability: human feedback for correction
• Frequent itemset mining with constraints
[Lakkaraju, Bach & Leskovec, 2016]
Right for the Right Reasons Models
• Models can be right for the wrong reasons [Ross 2017]
• Use domain knowledge to constrain explanations
• Train models with input gradient penalties
[Ross 2017]
Generalizability | Models
• Dr. Warren is Katherine’s hospitalist. He is looking at his patients’ length of stay
prediction. He observes that many of the instances have similar explanations.
• However the explanations do not appear to be generalizable to the whole population
• When he looks into other cases, he realizes that he had been looking at a specialized
population
Chief Complaint vs. Explanation
• Chief complaints: dyspnea, anemia, SOB, fatigue, sepsis, abd pain, fatigue, CKD, UTI
• Every one of these instances receives the same explanation: EF < 40%
Generalizability | Explanations
• Local models: models that give explanations at the level of an instance
• Examples: LIME, Shapley values, etc.
• Cohort-level models: a type of global model where the explanations are generated
at the level of a cohort
• Examples: same as global models
• Global models: models that give explanations at the level of the whole model or population
• Examples: decision trees, rule-based models, etc.
Generalizability | Algorithms
• Are the interpretable/explainable methods generalizable to all
predictive algorithms, to a particular class of algorithms, or tied to a
particular algorithm?
• Model Agnostic Explanations:
• Examples: LIME, Shapley Values etc.
• Model Class Specific Explanations:
• Examples: Tree Explainers, Gradient Boost Explainers
• Model Specific Explanations:
• Examples: CENs, Decision Trees, Random Forest Explainer etc.
Generalizability | Dimensions of Explanations
Data
• What variables or features are most relevant for the prediction of length of stay?
Prediction
• Explain why certain patients are being predicted to have long length of stays
Model
• What are the patterns belonging to a particular category (long length of stay)
typically look like?
Prediction Decomposition
• Explain the model prediction for one
instance by measuring the difference
between the original prediction and the one
made when omitting a set of features
[Robnik-Sikonja and Kononenko 2008]
• Problem: imputation may be needed in case
of missing values (a minimal sketch follows below)
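A minimal sketch of prediction decomposition by feature omission, assuming synthetic length-of-stay data and mean imputation as the "omission" mechanism (which is exactly where the imputation caveat above bites).

```python
# For one instance, replace each feature in turn with its training mean and
# record how much the prediction moves. Model and data are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
feature_names = ["age", "sodium", "hemoglobin", "prior_los"]
X = rng.random((300, 4))
y = 4 * X[:, 0] + 2 * X[:, 3] + rng.normal(0, 0.1, 300)   # length of stay
model = RandomForestRegressor(random_state=0).fit(X, y)

x0 = X[:1]                                # the instance to explain
baseline = model.predict(x0)[0]
means = X.mean(axis=0)

for j, name in enumerate(feature_names):
    x_omit = x0.copy()
    x_omit[0, j] = means[j]               # "omit" feature j via mean imputation
    delta = baseline - model.predict(x_omit)[0]
    print(f"{name:>10}: contribution of about {delta:+.2f} days")
```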
Model Outputs
Rule Based Models Relative Variable Importance Case Based Models
Rules for Length of Stay
Prediction
IF age > 55 AND gender = male
AND condition = ‘COPD’ AND
complication = ‘YES’
THEN
Length of stay = long (> 7 days)
Top 5 Variables for Length of Stay
Prediction
Variable Importance
Age 0.45
Gender 0.37
Diabetic 0.32
Race 0.21
Smoker 0.14
Example cases for Length of Stay
Prediction
Patient X is predicted to have a length
of stay of 20 days because he is most
similar to these 5 patients, who on
average had a length of stay of 20 days
[Druzdzel 1996]
Audience Quiz: Parsimony
Why would one not use the simplest model:
A. It may not generalize
B. I like complicated models
C. The model is right for wrong reasons
D. Both A & B
E. Both B & C
Trust/Performance | ICU Transfer Prediction
Trust / Performance
• The expectation that the corresponding
predictive algorithm for explanations should have
a certain performance
ICU Transfer Prediction
• Predict if a patient on the hospital ward will
require transfer to the intensive care unit due to
increasing acuity of care needs
[Example deterioration alert: Steven Miller, Rm 436, risk score 81; top factors: low blood pressure,
increasing FiO2 requirement, lethargy]
Trust | Prediction Performance
• Expectation that the predictive system has a
sufficiently high performance e.g., precision,
recall, AUC etc. [Lipton 2016, Hill 2018]
• Explanations accompanied with sub-par
predictions can foster distrust
• The model should perform sufficiently well on the
prediction task in its intended use
• Trauma patients: vital signs and lab values fulfill the
criteria to trigger alarms, leading to increasing
numbers of false positives [Nguyen 2014]
Trust | Human Performance Parity
• The model has at least parity with the performance of human practitioners
• The model makes the same types of mistakes that a human makes
• Example: Model B has human performance parity
Model Precision Recall F-Score Accuracy
Physician’s Prediction 0.73 0.71 0.72 0.60
Model A 0.82 0.68 0.74 0.65
Model B 0.83 0.81 0.82 0.89
Results for Length of Stay Prediction (Long vs. Short Stays)
Trust | Counterpoint to Human Parity
• Some models may not need human parity if they forgo predicting on cases that humans are
already good at predicting
• The predictive system could be a hybrid of human and machine decision making
Model Precision Recall F-Score Accuracy
Physician’s Prediction 0.73 0.71 0.72 0.60
Model A 0.24 0.68 0.74 0.65
Model B 0.83 0.81 0.82 0.89
Results for Length of Stay Prediction (Long vs. Short Stays)
Trust | Performance vs. Explainability
Image Source: Easy Solutions
Trust | Performance, Explanation and Risk Trade-off
• The trade-off is between performance,
explanation, and risk
• High-risk problems, like predicting end
of life, may need explanations because the
cost of a false positive may be very high
• Low-risk problems, like cost prediction, may
not need explanations, so optimization is
centered on prediction performance
Trust | Explaining Deep Learning
[Craven 1996]
Trust | Explanations vs. Associations
• A CNN is trained to recognize objects in images
• A language generating RNN is trained to translate features of the CNN into words and captions
Trust | Explanations vs. Associations
• Hendricks et al created a system to generate explanations of bird classifications.
The system learns to:
• Classify bird species with 85% accuracy
• Associate image descriptions (discriminative features of the image) with class
definitions (image-independent discriminative features of the class)
• Limitations
• Limited (indirect at best) explanation of internal logic
• Limited utility for understanding classification errors
Trust | Explainability and Adversarial ML
[Szegedy et al., 2013, Ghorbani 2017]
Fidelity | Risk of Readmission
Fidelity
• The expectation that the explanation and the
predictive model align well with one another
Risk of Readmission
• Predict if the patient will be readmitted within a
particular span in time, i.e. 30 days from time of
discharge
Fidelity
• Dr. Warren examines the explanations for risk of readmission and discovers that the
explanations do not appear to be correct
• After ruling out non-determinism and lack of consistency as explanations, Katherine
examines the data
• They discover there is a problem with the data, leading to incorrect explanations
• Fixing the data does not completely solve the problem of incorrect explanations
• Katherine discovers that the explanation model and the prediction model are different
Fidelity | Data Provenance
• The explanation is only going to be as good as the data
• Pneumonia mortality risk prediction example
• Low-quality data
• Instrumentation problems
• Censoring
• Incorrect explanations may be given because of problems in the data
• Constraints on data collection may also show up as constraints on the explanations
• Shifts in the distributions of the underlying data must be recognized
Fidelity | Models
• Fidelity with the underlying
phenomenon
• Readmission model which uses lunar
cycles as a feature: The model may
have good predictive power and the
feature may even be quite helpful but
the model does not have fidelity with
the underlying phenomenon
Fidelity | Explanations
• An explanation is sound if it adheres to how the model actually works
• An explanation is complete if it encompasses the complete extent of the model
• Soundness and completeness are relative
• The soundness of an explanation can also vary depending on the constraints
on the explanations
• Ante-hoc models are perfectly sound by definition; post-hoc models can have
varying levels of soundness
Fidelity | Explanations
• Fidelity with the prediction model i.e., explanations should align as close to the
predictive model as possible (Soundness)
• Ante-Hoc: Models where the predictive model and the explanation model is the
same
• Post-Hoc Models: Models where the predictive model and the explanation model
are different
• Special Case: Mimic Models
Fidelity | Mimic Models
• Also known as shadow, surrogate, or
student models
• Use the output (instead of the true
labels) from the complex model together with the
training data to train a model which is
explainable [Bucilua 2006, Tan 2017]
• The performance of the student model
is usually quite good
• Example: given a highly accurate SVM,
train a decision tree on the predicted
labels of the SVM and the original data (a minimal sketch follows below)
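A minimal sketch of the SVM-to-decision-tree mimic recipe described above, assuming synthetic data; the student tree is trained on the teacher's predicted labels rather than the true labels.

```python
# Train an accurate but opaque teacher (an SVM), then fit an interpretable
# student (a shallow decision tree) on the teacher's predicted labels.
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)

teacher = SVC(kernel="rbf").fit(X, y)           # opaque, accurate model
soft_labels = teacher.predict(X)                # teacher's outputs, not true y

student = DecisionTreeClassifier(max_depth=3, random_state=0)
student.fit(X, soft_labels)                     # mimic the teacher

print("student agrees with teacher on "
      f"{(student.predict(X) == soft_labels).mean():.0%} of the training data")
print(export_text(student))                     # human-readable explanation
```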
Fidelity | Mimic Models for Deep Learning
Fidelity | Imputation and Explanations
• Electronic health record data is often sparse
• Imputation is often used to complete the necessary data fields
• How should imputed fields affect model explanations?
• Possible concerns with safety & user trust:
• If the explanation for a readmission risk score includes the results of a test that a physician has not
yet ordered, how will the physician react?
• Could it affect patient safety?
• If patient has a low readmission risk due to a ”low troponin” yet troponin has not been evaluated, the physician
may overlook that fact and assume that an acute coronary event has already been ruled out
[Saha et al. 2015]
BETA
• Black Box Explanation through Transparent Approximations
• BETA learns a compact two-level decision set in which each rule explains part of
the model behavior unambiguously
• Novel objective function so that the learning process is optimized for:
• High Fidelity (high agreement between explanation and the model)
• Low Unambiguity (little overlap between decision rules in the explanation)
• High Interpretability (the explanation decision set is lightweight and small)
[Lakkaraju, Bach & Leskovec, 2016]
Interpreting Random Forests
• For each decision that a tree (or a forest)
makes, there is a path (or paths) from the
root of the tree to the leaf
• The path consists of a series of
decisions, each corresponding to a particular
feature, each of which contributes to the
final prediction
• Variable importance can be computed as
the contribution of each node for a
particular decision across all the decision
trees in the random forest (a minimal sketch follows below)
[Saabas 2014]
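A minimal sketch of the path-based decomposition for a single regression tree in the spirit of [Saabas 2014], assuming synthetic data; averaging these per-tree contributions over all trees extends the idea to a random forest.

```python
# Walking from root to leaf, each split's change in node mean is credited to
# the split feature, so prediction = bias (root mean) + sum of contributions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
feature_names = ["age", "sodium", "prior_los", "hr"]
X = rng.random((300, 4))
y = 5 * X[:, 0] + 2 * X[:, 2] + rng.normal(0, 0.1, 300)
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

t = tree.tree_
x0 = X[0]
node, contributions = 0, dict.fromkeys(feature_names, 0.0)
bias = t.value[0][0][0]                              # mean at the root
while t.children_left[node] != -1:                   # until we hit a leaf
    f = t.feature[node]
    nxt = (t.children_left[node] if x0[f] <= t.threshold[node]
           else t.children_right[node])
    contributions[feature_names[f]] += t.value[nxt][0][0] - t.value[node][0][0]
    node = nxt

print("bias:", round(bias, 2),
      "contributions:", {k: round(v, 2) for k, v in contributions.items()})
print("check:", round(bias + sum(contributions.values()), 2),
      "==", round(tree.predict(X[:1])[0], 2))
```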
Audience Quiz: Fidelity
Ante-Hoc models are:
A. Always Sound
B. Mostly Sound
C. Always Complete
D. Mostly Complete
E. Both A & C
F. Both A & D
G. Both B & C
H. Both B & D
Conclusion: Explainability in Healthcare AI
Artificial Intelligence
Explanations
Assistive Intelligence
+
Clinical Expertise
References
Ahmad 2018 Muhammad Aurangzeb Ahmad, Carly Eckert, Greg McKelvey, Kiyana Zolfagar, Anam Zahid, Ankur Teredesai. Death vs. Data Science: Predicting End of Life IAAI February 2-6, 2018
Al-Shedivat 2017A Al-Shedivat, Maruan, Avinava Dubey, and Eric P. Xing. ”Contextual Explanation Networks.” arXiv preprint arXiv:1705.10301 (2017).
Al-Shedivat 2017B Al-Shedivat, Maruan, Avinava Dubey, and Eric P. Xing. "The Intriguing Properties of Model Explanations."
Bach 2015 Bach, Sebastian, et al. "On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation." PloS one 10.7 (2015): e0130140
Bayes 1763 Bayes, Thomas, Richard Price, and John Canton. "An essay towards solving a problem in the doctrine of chances." (1763): 370-418.
Biran 2014 Biran, Or, and Kathleen McKeown. "Justification narratives for individual classifications." In Proceedings of the AutoML workshop at ICML, vol. 2014. 2014.
Biran 2017A Biran, Or, and Kathleen R. McKeown. "Human-Centric Justification of Machine Learning Predictions." In IJCAI, pp. 1461-1467. 2017.
Biran 2017B Biran, Or, and Courtenay Cotton. "Explanation and justification in machine learning: A survey." In IJCAI-17 Workshop on Explainable AI (XAI), p. 8. 2017.
Bucilua 2006 Buciluǎ, Cristian, Rich Caruana, and Alexandru Niculescu-Mizil. "Model compression." In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 535-541. ACM,
2006.
Caruana 2015 Caruana, Rich, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad. "Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission." In Proceedings of the 21st ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1721-1730. ACM, 2015.
Caruana 2017 Caruana, Rich, Sarah Tan, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, Noemie Elhadad et al. "Interactive Machine Learning via Transparent Modeling: Putting Human Experts in the Driver’s Seat.” IDEA 2017
Chang 2009 Chang, Jonathan, et al. “Reading tea leaves: How humans interpret topic models.” Advances in Neural Information Processing Systems. 2009
Craik 1967 Craik, Kenneth James Williams. The nature of explanation. Vol. 445. CUP Archive, 1967.
Craven 1996 Mark W. Craven and Jude W. Shavlik. Extracting tree-structured representations of trained networks. Advances in Neural Information Processing Systems, 1996.
Datta 2016 Datta, Anupam, Shayak Sen, and Yair Zick. "Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems." Security and Privacy (SP), 2016 IEEE Symposium on. IEEE, 2016
Doshi-Velez 2014 Doshi-Velez, Finale, and Been Kim. "Towards a rigorous science of interpretable machine learning." arXiv preprint arXiv:1702.08608 (2017).
Doshi-Velez 2017 Doshi-Velez, Finale, Mason Kortz, Ryan Budish, Chris Bavitz, Sam Gershman, David O'Brien, Stuart Schieber, James Waldo, David Weinberger, and Alexandra Wood. "Accountability of AI under the law: The role of
explanation." arXiv preprint arXiv:1711.01134 (2017).
Druzdzel 1996 Druzdzel, Marek J. "Qualitiative verbal explanations in bayesian belief networks." AISB QUARTERLY (1996): 43-54.
Farah 2014 Farah, M.J. Brain images, babies, and bathwater: Critiquing critiques of functional neuroimaging. Interpreting Neuroimages: An Introduction to the Technology and Its Limits 45, S19-S30 (2014)
Freitas 2014 Freitas, Alex A. "Comprehensible classification models: a position paper." ACM SIGKDD explorations newsletter 15, no. 1 (2014): 1-10.
References
Friedman 2001 Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Vol. 1. New York: Springer series in statistics, 2001.
Friedman 2008 Friedman, Jerome H., and Bogdan E. Popescu. "Predictive learning via rule ensembles." The Annals of Applied Statistics2, no. 3 (2008): 916-954.
Goldstein 2014 Goldstein, Alex, Adam Kapelner, Justin Bleich, and Emil Pitkin. "Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation." Journal of Computational and Graphical
Statistics 24, no. 1 (2015): 44-65.
Guidotti 2018 Guidotti, Riccardo, Anna Monreale, Franco Turini, Dino Pedreschi, and Fosca Giannotti. "A survey of methods for explaining black box models." arXiv preprint arXiv:1802.01933(2018).
Gunning 2016 David Gunning Explainable Artificial Intelligence (XAI) DARPA/I2O 2016
Gorbunov 2011 Gorbunov, K. Yu, and Vassily A. Lyubetsky. "The tree nearest on average to a given set of trees." Problems of Information Transmission 47, no. 3 (2011): 274.
Hall 2018 Hall, P., Gill, N., Kurka, M., Phan, W. (May 2018). Machine Learning Interpretability with H2O Driverless AI. http://docs.h2o.ai.
Hardt 2016 Hardt, Moritz, Eric Price, and Nati Srebro. "Equality of opportunity in supervised learning." In Advances in neural information processing systems, pp. 3315-3323. 2016.
Hastie 1990 T. Hastie and R. Tibshirani. Generalized additive models . Chapman and
Hall/CRC, 1990.
Herlocker 2000 Herlocker, Jonathan L., Joseph A. Konstan, and John Riedl. "Explaining collaborative filtering recommendations." In Proceedings of the 2000 ACM conference on Computer supported cooperative work, pp. 241-250.
ACM, 2000.
Hoffman 2017 Hoffman, Robert R., Shane T. Mueller, and Gary Klein. "Explaining Explanation, Part 2: Empirical Foundations." IEEE Intelligent Systems 32, no. 4 (2017): 78-86.
Hutson 2018 Hutson, Matthew. "Has artificial intelligence become alchemy?." (2018): 478-478.
Koh 2017 Koh, Pang Wei, and Percy Liang. "Understanding black-box predictions via influence functions." arXiv preprint arXiv:1703.04730 (2017).Harvard
Kulesza 2014 Kulesza, Todd. "Personalizing machine learning systems with explanatory debugging." (2014).
Lakkaraju 2016 Himabindu Lakkaraju, Stephen H. Bach, and Jure Leskovec. “Interpretable decision sets: A joint framework for description and prediction.” Proc. 22nd ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining.
ACM, 2016.
Lei 2017 Jing Lei, Max G’Sell, Alessandro Rinaldo, Ryan J. Tibshirani, and Larry Wasserman. Distribution-free predictive inference for regression. Journal of the American Statistical Association, 2017
Lipovetsky 2001 Lipovetsky, Stan, and Michael Conklin. "Analysis of regression in game theory approach." Applied Stochastic Models in Business and Industry 17.4 (2001): 319-330
Lipton 2016 Lipton, Zachary C. ”The mythos of model interpretability.” arXiv preprint
arXiv:1606.03490 (2016).
Lipton 2017 Lipton, Zachary C. "The Doctor Just Won't Accept That!." arXiv preprint arXiv:1711.08037 (2017).
Lou 2013 Lou, Yin, Rich Caruana, Johannes Gehrke, and Giles Hooker. ”Accurate intelligible models with pairwise interactions.” In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and
data mining, pp. 623-631. ACM, 2013.
Lundberg 2017 Lundberg, Scott M., and Su-In Lee. "A unified approach to interpreting model predictions." In Advances in Neural Information Processing Systems, pp. 4765-4774. 2017.
Miller 2017A Miller, Tim. ”Explanation in artificial intelligence: Insights from the social sciences.” arXiv preprint arXiv:1706.07269 (2017).
Miller 2017B Miller, Tim, Piers Howe, and Liz Sonenberg. "Explainable AI: Beware of inmates running the asylum." In IJCAI-17 Workshop on Explainable AI (XAI), vol. 36. 2017.
Montavon 2017 Montavon, Grégoire, Sebastian Lapuschkin, Alexander Binder, Wojciech Samek, and Klaus-Robert Müller. "Explaining nonlinear classification decisions with deep taylor decomposition." Pattern Recognition 65 (2017):
211-222.
References
Moosavi-Dezfooli
2016
Moosavi-Dezfooli, Seyed-Mohsen, Alhussein Fawzi, and Pascal Frossard. "Deepfool: a simple and accurate method to fool deep neural networks." In Proceedings of the IEEE Conference on Computer Vision and
Pattern Recognition, pp. 2574-2582. 2016.
Morstatter 2016 Fred Morstatter and Huan Liu Measuring Topic Interpretability with Crowdsourcing KDD Nuggets November 2016
Nott 2017 Nott, George “Google's research chief questions value of 'Explainable AI'” Computer World 23 June, 2017
Olah 2018 Olah, Chris, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye, and Alexander Mordvintsev. "The building blocks of interpretability." Distill 3, no. 3 (2018): e10.
Perez 2004 Pérez, Jesus Maria, Javier Muguerza, Olatz Arbelaitz, and Ibai Gurrutxaga. "A new algorithm to build consolidated trees: study of the error rate and steadiness." In Intelligent Information Processing and Web Mining,
pp. 79-88. Springer, Berlin, Heidelberg, 2004.
Quinlan 1999 Quinlan, J. Ross. "Some elements of machine learning." In International Conference on Inductive Logic Programming, pp. 15-18. Springer, Berlin, Heidelberg, 1999.
Ras 2018 Ras, Gabrielle, Pim Haselager, and Marcel van Gerven. "Explanation Methods in Deep Learning: Users, Values, Concerns and Challenges." arXiv preprint arXiv:1803.07517(2018).
Ribeiro 2016a Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. ”Why should i trust you?: Explaining the predictions of any classifier.” In Proceedings
of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135-1144. ACM, 2016.
Ribeiro 2016b Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin Introduction to Local Interpretable Model-Agnostic Explanations (LIME) August 12, 2016 Oriely Media
Ross 2018 Ross, Casey IBM’s Watson supercomputer recommended ‘unsafe and incorrect’ cancer treatments, internal documents show Stat News July 25, 2018
Ross 2017 Ross, Andrew Slavin, Michael C. Hughes, and Finale Doshi-Velez. "Right for the right reasons: Training differentiable models by constraining their explanations." arXiv preprint arXiv:1703.03717 (2017).
Saabas 2014 Saabas, Ando. Interpreting random forests Data Dive Blog 2014
Shrikumar 2017 Shrikumar, Avanti, Peyton Greenside, and Anshul Kundaje. "Learning important features through propagating activation differences." arXiv preprint arXiv:1704.02685 (2017)
Strumbelj 2014 Strumbelj, Erik, and Igor Kononenko. "Explaining prediction models and individual predictions with feature contributions." Knowledge and information systems 41.3 (2014): 647-665.
Tan 2017 Tan, Sarah, Rich Caruana, Giles Hooker, and Yin Lou. "Detecting Bias in Black-Box Models Using Transparent Model Distillation." arXiv preprint arXiv:1710.06169 (2017).
Turing 1950 Machinery, Computing. "Computing machinery and intelligence-AM Turing." Mind 59, no. 236 (1950): 433.
Ustun 2016 Ustun, Berk, and Cynthia Rudin. ”Supersparse linear integer models for optimized medical scoring systems.” Machine Learning 102, no. 3 (2016):349-391.
Wang 2015 Wang, Fulton, and Cynthia Rudin. ”Falling rule lists.” In Artificial Intelligence and Statistics, pp. 1013-1022. 2015.
Weld 2018 Weld, Daniel S., and Gagan Bansal. "Intelligible Artificial Intelligence." arXiv preprint arXiv:1803.04263 (2018).
Weller 2017 Weller, Adrian. "Challenges for transparency." arXiv preprint arXiv:1708.01870 (2017).
Wick 1992 M. R. Wick and W. B. Thompson. Reconstructive expert system explanation. Artificial Intelligence, 54(1- 2):33–70, 1992
Yang 2016 Yang, Hongyu, Cynthia Rudin, and Margo Seltzer. ”Scalable Bayesian rule lists.” arXiv preprint arXiv:1602.08610 (2016).
Trustworthiness of AI based predictions Aachen 2024
 
Ai demystified for HR and TA leaders
Ai demystified for HR and TA leadersAi demystified for HR and TA leaders
Ai demystified for HR and TA leaders
 
Becoming Datacentric
Becoming DatacentricBecoming Datacentric
Becoming Datacentric
 
2018.01.25 rune sætre_triallecture_xai_v2
2018.01.25 rune sætre_triallecture_xai_v22018.01.25 rune sætre_triallecture_xai_v2
2018.01.25 rune sætre_triallecture_xai_v2
 
Data Science Salon: nterpretable Predictive Models in the Healthcare Domain
Data Science Salon: nterpretable Predictive Models in the Healthcare DomainData Science Salon: nterpretable Predictive Models in the Healthcare Domain
Data Science Salon: nterpretable Predictive Models in the Healthcare Domain
 
AI in the Real World: Challenges, and Risks and how to handle them?
AI in the Real World: Challenges, and Risks and how to handle them?AI in the Real World: Challenges, and Risks and how to handle them?
AI in the Real World: Challenges, and Risks and how to handle them?
 
KGCTutorial_AIISC_2022.pptx
KGCTutorial_AIISC_2022.pptxKGCTutorial_AIISC_2022.pptx
KGCTutorial_AIISC_2022.pptx
 
ITS 832Chapter 5From Building a Model to Adaptive Robust.docx
ITS 832Chapter 5From Building a Model to Adaptive Robust.docxITS 832Chapter 5From Building a Model to Adaptive Robust.docx
ITS 832Chapter 5From Building a Model to Adaptive Robust.docx
 
AI Driven Product Innovation
AI Driven Product InnovationAI Driven Product Innovation
AI Driven Product Innovation
 
AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19AI-driven product innovation: from Recommender Systems to COVID-19
AI-driven product innovation: from Recommender Systems to COVID-19
 

Último

PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINsankalpkumarsahoo174
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxjana861314
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfSumit Kumar yadav
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 

Último (20)

PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATINChromatin Structure | EUCHROMATIN | HETEROCHROMATIN
Chromatin Structure | EUCHROMATIN | HETEROCHROMATIN
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Botany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdfBotany 4th semester series (krishna).pdf
Botany 4th semester series (krishna).pdf
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Engler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomyEngler and Prantl system of classification in plant taxonomy
Engler and Prantl system of classification in plant taxonomy
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 

Explainable AI in Healthcare

  • 13. Why Do We Need Explanations in AI & Machine Learning? Machine Learning is Everywhere!
  • 14. Why Do We Need Explanations in Healthcare AI Now? More Implications (known/unknown)
  • 15. Why Do We Need Explanations in Healthcare AI? When fairness is critical: • Any context where humans are required to provide explanations so that people cannot hide behind machine learning models [Al-Shedivat 2017B, Doshi-Velez 2014] When consequences are far-reaching: • Predictions can have far reaching consequences e.g., recommend an operation, recommend sending a patient to hospice etc. When the cost of a mistake is high: • Ex: misclassification of a malignant tumor can be costly and dangerous When a new/unknown hypothesis is drawn: • “It's not a human move. I've never seen a human play this move.” [Fan Hui] • Pneumonia patients with asthma had lower risk of dying
  • 16. When Do We Need Explanations in Healthcare AI? • When compliance is required: • GDPR • Right to explanation • Predictive performance is not enough [Doshi-Velez 2017] Xkcd comics
  • 17. Why Do We Need Explanations Now? [Morsey 2018] [Hernandez 2018, Ross 2018] IBM has a Watson Dilemma [Ram 2018]
  • 18. Why Do We Need Explanations Now? [Caruana 2015, Caruana 2017] Risk Prediction with Blackbox Models Misdiagnosis via Adversarial Attacks [Finlayson 2018]
  • 19. Bayesian Models Neural Nets Ensemble Models Markov Models Statistical Models Graphical Models Reinforcement Learning Natural Language Processing Expert Systems Recommendation Systems [Chang 2009] Supervised Learning
  • 20. How Machine Learning Can Improve Healthcare Across the Continuum of Care? [Figure: healthcare utilization across the lifespan (age 0–80), from birth through colds and minor injuries, appendicitis, birth of two children, diabetes, CHF, and death. Opportunities: optimize hospital efficiency, optimize clinic operations and benefit to patient, optimize inpatient flow, prevent sequelae of disease, improve operations and diagnose illness sooner, chronic disease management (DM, CHF), prevent post-operative complications, help patients plan for end of life needs. Corresponding predictions: variation analysis, no-show prediction, LOS prediction, ROR prediction, ED flow suite, SSI prediction, care management RecSys, disease progression and complication prediction, Rx noncompliance prediction, EF progression prediction, preventive care, EOL prediction, other predictions]
  • 22. Operationalizing AI in Healthcare Only a small fraction of real-world machine learning systems actually constitutes machine learning code [Sculley 2015].
  • 23. Data in Operationalized AI in Healthcare • Syntactic Correctness • Is the data in the correct format e.g., if the AI data pipeline requires sex for males as ‘m’ and the input data encodes it as ‘1’ • Morphological Correctness • Is the data within the range of possible values e.g., a blood pressure of 500 does not make sense • Semantic Correctness • Do the variables actually correspond to the semantics being ascribed to them e.g., a variable which encodes blood pressure as high vs. low will have different semantics for children as compared to adults
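To make these checks concrete, here is a minimal sketch (assuming pandas and hypothetical column names such as sex and systolic_bp that are not from the tutorial itself) of how syntactic and morphological correctness might be screened before data enters an AI pipeline; semantic checks need clinical context, illustrated here with a crude age-dependent flag.

```python
import pandas as pd

# Hypothetical extract of an EHR feed; column names and values are illustrative only.
df = pd.DataFrame({
    "sex": ["m", "f", "1", "m"],
    "systolic_bp": [118, 500, 95, 132],   # mmHg
    "age_years": [54, 7, 71, 33],
})

# Syntactic correctness: sex must be encoded as 'm' or 'f', not '1'/'0'.
syntactic_violations = df[~df["sex"].isin(["m", "f"])]

# Morphological correctness: a systolic BP of 500 mmHg is outside any plausible range.
morphological_violations = df[(df["systolic_bp"] < 40) | (df["systolic_bp"] > 300)]

# Semantic correctness needs context: the same BP threshold means different things
# for children and adults, so flag rows where an adult cutoff is applied to a child.
semantic_flags = df[(df["age_years"] < 12) & (df["systolic_bp"] > 120)]

print(syntactic_violations)
print(morphological_violations)
print(semantic_flags)
```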
  • 24. Audience Quiz: Healthcare AI Explanations are needed in healthcare AI to: A. Foster trust in AI models in healthcare B. Reduce the risk of ill-informed decision making C. Comply with regulations like GDPR D. All of the above
  • 26. 7 Pillars of Explainable AI in Healthcare: Transparency, Domain Sense, Consistency, Parsimony, Fidelity, Trust/Performance, Generalizability
  • 27. 7 Pillars of Explainable AI in Patient Flow: transparency, domain sense, consistency, parsimony, generalizability, trust, fidelity
  • 28. Transparency | Admission Prediction Transparency Ability of the machine learning algorithm, model, and the features to be understandable by the user of the system. Admission Prediction What is the likelihood of the patient being admitted to the hospital from the emergency department? Katherine is a data scientist working at an ML company who specializes in explainable ML. She is generally healthy but was recently seen by her PCP for gallstones. She now has RUQ abd pain and a fever and is at County Emergency Department for evaluation and treatment.
  • 29. Transparency • The ML model for predicting Katherine’s likelihood of admission gives her a high likelihood (0.62) • Katherine’s physician has noted her age, health history, and vital signs and is surprised by this elevated risk score • The physician knows that the risk model is a deep learning model so he cannot understand how it is working • But he can examine the top factors associated with the prediction
  • 30. Transparent to Whom? • Transparency may mean different things to different people • Understanding Model Outputs: • Understanding Algorithms: • Understanding the algorithm may not be sufficient:
  • 31. Transparency | Simulatability • The whole model must be understandable simultaneously • One must be able to look at the model and understand the whole model without excessive cognitive effort [Lipton 2016] • Example: While both decision trees are explainable, Decision Tree B has the property of simulatability but Decision Tree A does not [Figures: Decision Tree A, Decision Tree B]
  • 32. Transparency | Decomposability • Each component should also admit an easy/intuitive explanation • A linear model with highly engineered features vs. a linear model with simple features [Lipton 2016] • Example: Model A is decomposable but Model B is not Regression Variables • Age • Gender • Race • Diabetic • Smoker Target Variable • Length of Stay Regression Model A Regression Variables • W • F • Ww Target Variable • Length of Stay Regression Model B
  • 33. Transparency | Algorithmic • Guarantee that a model will converge [Lipton 2016, Hill 2018] • Models like Regression Models, SVM etc. have this property • Deep Learning does not have this property
  • 34. Transparency | Feedback & Scrutability • Feedback transparency refers to how changes to the model will affect the model’s predictions [Herlocker 2000] • How do multi-objective optimization models affect each other e.g., optimizing reduction of risk of readmission and reduction of length of stay • Not an absolute requirement; a nice-to-have property • Especially applicable to scrutable machine learning systems
  • 35. Transparency | Examples Transparent • Falling Rule Lists • GAM (Generalized Additive Models) • GA2M (Generalized Additive Models with interactions) • LIME (Locally Interpretable Model Agnostic Explanations) • Naïve Bayes • Regression Models • Shapley Values Semi-Transparent • Shallow Ensembles Non-Transparent • Deep Learning • SVM (Support Vector Machines) • Gradient Boosting Models
  • 36. Transparency | Locally Interpretable Model Explanations • Local Interpretable Model-agnostic Explanations (LIME) [Ribeiro 2016a] • Given a non-interpretable model, create local interpretable models that are good approximations of its behavior at the local level • The local model can be a linear model such as a regression model • LIME simulates the distribution of the data locally • Can be problematic if the decision boundary is not linear locally
  • 37. Transparency | Locally Interpretable Model Explanations [Ribeiro 2016a, Ribeiro 2016b] Computer Vision Example: Object Detection Healthcare Example: Chronic Kidney Disease
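As a rough illustration of how LIME might be used on tabular healthcare-style data, the sketch below fits a local surrogate around a single instance, assuming the lime package and a scikit-learn classifier; the data and feature names (age, heart_rate, creatinine) are synthetic placeholders, not from the tutorial's models.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Synthetic stand-in for a patient feature matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Black-box model whose individual predictions we want to explain.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X,
    feature_names=["age", "heart_rate", "creatinine"],
    class_names=["no_admission", "admission"],
    mode="classification",
)

# Fit a local linear surrogate around one instance and report its top features.
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=3)
print(exp.as_list())
```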
  • 38. Transparency | Shapley Values • Game Theoretic Method for determining feature contributions [Strumbelj 2014, Lipovetsky 2001, Lundberg 2017] • Each feature is a ‘player’ in a game where the prediction is the payout • The Shapley value - a method to fairly distribute the ‘payout’ among the features.
  • 39. Transparency | Shapley Values [Figure: example length-of-stay predictions for individual patients (4.9–6.9 days) decomposed into per-feature Shapley contributions]
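A minimal sketch of computing per-feature Shapley contributions with the shap package (an assumption; the slides do not name a specific implementation) for a length-of-stay style regressor; the model and data are synthetic stand-ins, used only to show how a prediction decomposes into a baseline plus feature contributions.

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic length-of-stay style data with three illustrative features.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = 5.5 + 0.8 * X[:, 0] - 0.4 * X[:, 1] + rng.normal(scale=0.3, size=400)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# TreeExplainer distributes the difference between a prediction and the
# model's expected value across the features (the Shapley 'payout').
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:5])

print("baseline (expected value):", explainer.expected_value)
print("per-feature contributions for the first patient:", shap_values[0])
```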
  • 40. Transparency in Deep Learning • Deep Learning Models are not transparent [Bach 2015] • Features extracted by deep learning models give different levels of transparency • Many model-agnostic methods exist to extract explanations from deep learning models • Mimic models are also used widely for creating interpretable models
  • 41. Domain Sense | ED Census Prediction ED Census Prediction Predict the number of patients in the emergency department (ED) at a given time Domain Sense The explanation should make sense in the domain of application and to the user of the system ED
  • 42. Domain Sense • Katherine and her physician, Dr. Monroe, discuss the problem of ED census prediction • The top features may be temporal as well as those related to the inpatient facility (how many available beds) -- explanations are reconfigured to surface only factors that are modifiable • Wendy, a nurse, sees the new dashboard but the modifiable factors are not really helpful to her. The explanation dashboard is then customized for Wendy. • In the aggregate the top reason for high ED census is related to the lack of available inpatient beds and not the influx of ED patients
  • 43. Domain Sense | Explanations are Role-Based • A physician requires different explanations as compared to a staffing planner • Explanations need to be in the right language and also in the right context [Doshi-Velez 2014, Druzdzel 1996] • Making domain sense may require sacrificing or deemphasizing other requirements for explanations like fidelity • If actionability is required then the factors for explanations should reflect that
  • 44. Domain Sense | Taxonomy of Explainable Factors
  • 45. Domain Sense | Interpreting Risk Score • Interpreting output from machine learning models may also have an element of subjectivity • Example: Predicting risk of readmission • If a patient has a risk score 0.62, what does that mean? • Some domain knowledge may be needed to map it to a manageable risk • Underlying distributions may also change, and thus the interpretation may also change, e.g., a CHF cohort
  • 46. Domain Sense | Actionability of Explanations in Healthcare • Actionability is context dependent and role dependent • Actionability is NOT causality • There may be circumstances where actionability is not possible • Temporality associated with modifiable factors: flu vaccine is a 1 minute task vs. weight loss programs which can take months • Actionability does not presume causality but it can be used to devise interventions or to investigate novel associations If you take this medicine then you will be fine in 4 days. So if I take 4 times the dosage then I will be fine by tonight!
  • 47. Domain Sense | Narrative Roles • Normal evidence is evidence that is expected to be present in many instances predicted to be in the class. • With high importance, a high effect is not surprising. • Exceptional evidence is evidence that is not usually expected to be present. • Contrarian evidence is strongly unexpected, since the effect has the opposite sign than expected. [Biran 2017A]
  • 48. Domain Sense | GAM (Generalized Additive Models) • GAMs combine single-feature models called shape functions through a linear function [Lou 2012] • Shape functions can be arbitrarily complex • GAMs have more predictive power as compared to linear models
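A small GAM sketch, assuming the pygam package (not named in the slides): each s(i) term is a spline shape function for one feature, combined additively, and the fitted shape functions can be inspected via partial dependence. The data here is synthetic.

```python
import numpy as np
from pygam import LinearGAM, s

# Synthetic data: length of stay as a nonlinear function of two features.
rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(300, 2))
y = np.sin(X[:, 0]) + 0.3 * X[:, 1] + rng.normal(scale=0.2, size=300)

# One spline shape function per feature, combined additively.
gam = LinearGAM(s(0) + s(1)).fit(X, y)

# Inspect the learned shape function for the first feature.
XX = gam.generate_X_grid(term=0)
shape_values = gam.partial_dependence(term=0, X=XX)
print(shape_values[:5])
```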
  • 49. Domain Sense | GA2M • Generalized Additive Models with pairwise interactions (GA2M) [Caruana 2015, Caruana 2017] • Accounts for pairwise interactions within the feature space in addition to individual feature effects • Possible to visualize the relationship between the target variable and the features to gain insights into why the prediction is being made
  • 50. Domain Sense | GA2M • There can be a massive number of pairwise interactions between a large number of features e.g., n = 1000 • GA2M reduces the number of pairwise interactions to be considered
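GA2M-style models are available off the shelf; the sketch below uses ExplainableBoostingClassifier from the interpret package as one such implementation (an assumption, not a tool named in the slides), keeping only a small number of the strongest pairwise interactions rather than all O(n²) feature pairs. The data and parameters are illustrative.

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier

# Synthetic binary outcome with an interaction between the first two features.
rng = np.random.default_rng(3)
X = rng.normal(size=(600, 4))
y = ((X[:, 0] * X[:, 1] + X[:, 2]) > 0).astype(int)

# Rather than considering every feature pair, the model keeps only a small
# number of the strongest pairwise interaction terms.
ebm = ExplainableBoostingClassifier(interactions=5, random_state=0)
ebm.fit(X, y)
print("training accuracy:", ebm.score(X, y))

# The global explanation exposes one shape function per feature plus the
# selected pairwise interaction terms (viewable with interpret's show()).
global_explanation = ebm.explain_global()
```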
  • 51. Audience Quiz: Domain Sense All Explanations in Healthcare AI are: A. Actionable B. Causal C. Role Based D. All of the above
  • 52. Consistency | LWBS Consistency The explanation should be consistent across different models and across different runs of the model LWBS Left without being seen refers to a patient leaving the facility without being seen by a physician Factors Prior LWBS Insurance status Past history Collins Katherine 37 ABDOMINAL PAIN Monroe Private 16:52 314 0 24 Kennedy Susan 72 DYSPNEA Lewis Medicare 19:45 135 1 60 Factors Temperature Insurance status Chief complaint Age ~18:00 ~22:00 Collins Katherine 37 ABDOMINAL PAIN Monroe Private 16:52 68 0 24 62 40 62 40
  • 53. Consistency • Dr. Monroe examines the reasons for a patient who is predicted to leave without being seen and notices that the explanation for them leaving is different from what he observed 4 hours ago • Upon investigation Katherine determines that the LIME model is being used for prediction which is non-deterministic, hence differences in explanations • Having different explanations for the same instance can be confusing for users, hence the need for consistency
  • 54. Consistency | Inconsistent Ranking • Patient: Harrison Welles: age 72, COPD & chronic liver disease • Use Case: Length of Stay Prediction Population Ranking LIME Shapley Values Min Arterial Pressure past 12 months Min Arterial Pressure past 12 months Min Arterial Pressure past 12 months HR in ED <60 or >110 Max ALT past 12 months Max ALT past 12 months Chronic Condition Dementia Min Sodium past 12 months Transfer Patient Temperature in ED <34 or >38 Max Neutrophils past 12 months Min Sodium past 12 months Chronic Condition Hyperlipidemia Median PaO2 : FiO2 past 12 months Max AST past 12 months Chronic Condition Anemia Most Recent Glucose Max Neutrophils past 12 months Weekend Admission Transfer Patient Mean Temperature past 12 months Chronic Condition Diabetes Chronic Condition Acute Myocardial Infarction Most Recent Glucose Chronic Condition Ischemic Heart Disease Chronic Condition Heart Failure Std dev PaO2 : FiO2 past 12 months Chronic Condition Acute Myocardial Infarction Chronic Condition Diabetes Median HbA1c past 12 months
  • 55. Consistency | Model Multiplicity vs Explanation Consistency • Given the same dataset, multiple machine learning models can be constructed with similar performance • The explanations that are produced by multiple explainable algorithms should be very similar if not the same • Wide divergence in explanations is a sign of a problem with the explanations or with the algorithm(s) Variable Model A Model B Model C Age 1 1 5 Gender 2 4 6 Diabetic 3 5 1 Race 4 6 4 Smoker 5 2 3 Alcoholic 6 3 2 Variables for Length of Stay Prediction Ranked
  • 56. Consistency | Evaluating Consistency: Human Evaluation • Humans can evaluate quality of explanations across models • Does not scale well • Have humans provide explanations
  • 57. Consistency | Evaluating Consistency: Machine Evaluation • Kendall’s Tau [Kendall 1938] • Measure of rank correlation, i.e., the similarity of the orderings of the data when ranked by each of the quantities • Values range from −1 to +1
  • 58. Consistency | Evaluating Consistency: Machine Evaluation • Kendall’s W [Kendall 1939]: W = 12S / (m²(n³ − n)) for m raters (e.g., explanation methods) ranking n items (e.g., features) • Where S is the sum of squared deviations of the total ranks: S = Σᵢ (Rᵢ − R̄)² • Where the total rank and the mean total rank are given as: Rᵢ = Σⱼ rᵢⱼ and R̄ = m(n + 1)/2
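The sketch below shows one way such consistency checks could be computed: scipy.stats.kendalltau for pairwise agreement between two explanation rankings, and a direct implementation of the Kendall's W formula above across several rankings. The rank matrix is made up for illustration.

```python
import numpy as np
from scipy.stats import kendalltau

# Ranks assigned to the same 6 features by three explanation methods
# (1 = most important); rows are raters/models, columns are features.
ranks = np.array([
    [1, 2, 3, 4, 5, 6],   # e.g., global model ranking
    [1, 4, 5, 6, 2, 3],   # e.g., LIME ranking
    [5, 6, 1, 4, 3, 2],   # e.g., Shapley ranking
])

# Pairwise agreement between two explanation methods.
tau, p_value = kendalltau(ranks[0], ranks[1])
print(f"Kendall's tau (method 1 vs 2): {tau:.2f} (p={p_value:.2f})")

# Kendall's W across all m raters and n items:
# W = 12 * S / (m^2 * (n^3 - n)), with S = sum_i (R_i - R_bar)^2.
m, n = ranks.shape
R = ranks.sum(axis=0)                 # total rank of each feature
S = ((R - R.mean()) ** 2).sum()
W = 12.0 * S / (m ** 2 * (n ** 3 - n))
print(f"Kendall's W across all three explainers: {W:.2f}")
```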
  • 59. Consistency | Contextual Explanation Networks • CENs are deep networks that generate parameters for context-specific probabilistic graphical models which are further used for prediction and play the role of explanations [Al-Shedivat 2017A] • Prediction and explanations are a joint task • Local approximations align with the local decision boundaries
  • 60. Counterexamples | ED Arrivals Prediction • Wendy and Katherine are looking at ED arrivals prediction • Katherine shows the explanation factors to Wendy but the performance of the corresponding model is relatively low (Precision = 0.34, Recall = 0.46) • Wendy states that getting explanations for ED arrivals prediction is not very important, performance takes precedence
  • 61. Exceptions to Explanations | Near Perfect Performance • Models which have perfect, near-perfect or (always) better-than-human performance • Detection of Diabetic Retinopathy with near perfect precision and recall • Better than human detection of tumors in radiology • Pneumonia detection as good as humans but with scalability Gulshan 2016 Rajpurkar 2017, Mazurowski 2018
  • 62. Exceptions to Explanations | Theoretical Guarantees • PAC (Probably Approximately Correct) Learning • SVMs for linearly sparse data • Verification system for demonstrating that the output class of a certain neural network is constant across a desired neighborhood [Pulina 2010] • Verification systems for neural networks with certain assumptions, e.g., that only a subset of the hidden units in the neural network are relevant to each input I have given up convergence guarantees for Lent. The Machine Learning Hipster
  • 63. Limits of Explanations “You can ask a human, but, you know, what cognitive psychologists have discovered is that when you ask a human you’re not really getting at the decision process. They make a decision first, and then you ask, and then they generate an explanation and that may not be the true explanation.” - Peter Norvig
  • 64. Explainability and Cognitive Limitations • Machine Learning is used in problems where the size of the data and/or the number of variables is too large for humans to analyze • What if the most parsimonious model is indeed too complex for humans to analyze or comprehend? • Ante-Hoc explanations may be impossible and post-hoc explanations would be incorrect • Explanations may not be possible in some cases
  • 65. Audience Quiz: Exceptions What is a scenario when you may not want explanations: A. Theoretical performance guarantees B. Better than human prediction performance C. Whenever I want D. 0.7 AUC E. Both A & B F. Both C & D
  • 66. #DEATHVSDATASCIENCE Break So far we have covered: • Motivation for Explanation in ML in healthcare • Transparency • Domain Sense • Consistency • Counter examples to explanation After the break: • Parsimony • Generalizability • Trust • Fidelity • Discussion
  • 67. Parsimony | Admission Disposition Parsimony • The explanation should be as simple as possible • Applies to both the complexity of the explanation and the number of features provided to explain Admission Disposition • Where in the hospital the patient should go once they are admitted • elevated standard deviation of albumin • elevated heart rate variability • low potassium • min HR in ED • max BMI Patient Risk Profiles Admissions Disposition Sandra Robinson: 54% Observation Unit
  • 68. Diagnostic Parsimony • Healthcare in Practice • Multiple competing hypotheses • Hypotheses are constantly revised • ”most commons” are termed so for a reason • Occam’s Razor • To derive a unifying diagnosis that can explain all of a patient’s symptoms • Hickam’s Dictum • A man can have as many diseases as he damn well pleases
  • 69. Parsimony • MDL (Minimum Description Length) and Occam’s Razor • Occam’s Razor in Machine Learning [Domingos 1999] • Occam’s First Razor • Occam’s Second Razor • Occam’s Razor in Explainable AI • The simplest explanation is not always the best explanation
  • 70. Chatton’s Anti-Razor • “If three things are not enough to verify an affirmative proposition about things, a fourth must be added, and so on.” • The Heliocentric models were not better predictors at the time when they were first proposed • The Flat Earth is a simpler hypothesis than the round Earth model and is in concordance with a large number of observations
  • 71. Decision Lists | Falling Rule Lists • Order of rules determines order of classification • Probability of success decreases monotonically down the list • Example: patients would be stratified into risk sets and the highest at-risk patients should be considered first • Greedy tree based models and Bayesian models for list construction [Wang and Rudin, 2015]
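Purely to illustrate the structure (the rules, thresholds, and probabilities below are invented, not taken from any fitted falling rule list), such a model reads as an ordered series of if-then rules whose risk estimates decrease monotonically down the list:

```python
def falling_rule_list_risk(patient):
    """Illustrative falling rule list: risk estimates decrease down the list."""
    # Rule 1: the highest-risk stratum is checked first.
    if patient["prior_icu_admission"] and patient["age"] >= 75:
        return 0.62
    # Rule 2
    if patient["chronic_heart_failure"] and patient["creatinine"] > 2.0:
        return 0.41
    # Rule 3
    if patient["num_prior_admissions"] >= 3:
        return 0.23
    # Default rule: everyone else falls through to the lowest risk estimate.
    return 0.08


print(falling_rule_list_risk({
    "prior_icu_admission": False,
    "age": 68,
    "chronic_heart_failure": True,
    "creatinine": 2.4,
    "num_prior_admissions": 1,
}))  # -> 0.41
```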
  • 72. Bayesian Rule Lists • BRLs are decision lists: a series of if-then statements • BRLs discretize a high-dimensional, multivariate feature space into a series of simple, readily interpretable decision statements. • Experiments show that BRLs have predictive accuracy on par with the current top ML algorithms (approx. 85–90% as effective) but with models that are much more interpretable [Letham 2015]
  • 73. Decision Lists | Interpretable Decision Sets • Independent (unordered) if-then rules • Allows scrutability: human feedback for correction • Frequent itemset mining with constraints [Lakkaraju, Bach & Leskovec, 2016]
  • 74. Right for the Right Reasons Model • Models can be right for the wrong reasons [Ross 2017] • Use domain knowledge to constrain explanations • Training models with input gradient penalties [Ross 2017]
  • 75. Generalizability | Models • Dr. Warren is Katherine’s hospitalist. He is looking at his patients’ length of stay prediction. He observes that many of the instances have similar explanations. • However, the explanations do not appear to be generalizable to the whole population • When he looks into other cases, he realizes that he had been looking at a specialized population [Table: chief complaints (dyspnea, anemia, SOB, fatigue, sepsis, abd pain, CKD, UTI) all share the same explanation: EF < 40%]
  • 77. Generalizability | Explanations • Local Models: Models that give explanations at the level of an instance • Examples: LIME, Shapley Values etc. • Cohort Level Models: A type of global model where the explanations are generated at the level of a cohort • Examples: Same as global models • Global Models: Models that give explanations at the level of the whole model or population • Examples: Decision Trees, Rule Based Models etc.
  • 78. Generalizability | Algorithms • Are the interpretable/explainable algorithms generalizable to all predictive algorithms, a particular class of algorithms, or tied to a particular algorithm? • Model Agnostic Explanations: • Examples: LIME, Shapley Values etc. • Model Class Specific Explanations: • Examples: Tree Explainers, Gradient Boost Explainers • Model Specific Explanations: • Examples: CENs, Decision Trees, Random Forest Explainer etc.
  • 79. Generalizability | Dimensions of Explanations Data • What variables or features are most relevant for the prediction of length of stay? Prediction • Explain why certain patients are being predicted to have long length of stays Model • What do the patterns belonging to a particular category (long length of stay) typically look like?
  • 80. Prediction Decomposition • Explain the model prediction for one instance by measuring the difference between the original prediction and the one made when omitting a set of features [Robnik-Sikonja and Kononenko 2008] • Problem: Imputation may be needed in case of missing values
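A minimal sketch of this kind of prediction decomposition: for a single instance, each feature is "omitted" by replacing it with its training mean (a simplification of proper marginalisation or imputation) and the change in predicted probability is attributed to that feature. Model, data, and the helper name decompose_prediction are all illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 4))
y = (X[:, 0] - 0.7 * X[:, 3] > 0).astype(int)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

def decompose_prediction(model, X_train, x):
    """Attribute the prediction for instance x by omitting one feature at a time."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    contributions = []
    for j in range(len(x)):
        x_omit = x.copy()
        # 'Omit' feature j by replacing it with its training mean
        # (in practice a proper marginalisation or imputation model is used).
        x_omit[j] = X_train[:, j].mean()
        altered = model.predict_proba(x_omit.reshape(1, -1))[0, 1]
        contributions.append(base - altered)
    return base, contributions

base, contribs = decompose_prediction(model, X, X[0])
print("prediction:", round(base, 3), "per-feature contributions:", np.round(contribs, 3))
```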
  • 81. Model Outputs Rule Based Models Relative Variable Importance Case Based Models Rules for Length of Stay Prediction IF age > 55 AND gender = male AND condition = ‘COPD’ AND complication = ‘YES’ THEN Length of stay = long (> 7 days) Top 5 Variables for Length of Stay Prediction Variable Importance Age 0.45 Gender 0.37 Diabetic 0.32 Race 0.21 Smoker 0.14 Example cases for Length of Stay Prediction Patient X is predicted to have a length of stay of 20 days because he is most similar to these 5 patients who on average had length of stay of 5 days Druzdzel 1996
  • 82. Audience Quiz: Parsimony Why would one not use the simplest model: A. It may not generalize B. I like complicated models C. The model is right for wrong reasons D. Both A & B E. Both B & C
  • 83. Trust/Performance | ICU Transfer Prediction Trust / Performance • The expectation that the corresponding predictive algorithm for explanations should have a certain performance ICU Transfer Prediction • Predict if a patient on the hospital ward will require transfer to the intensive care unit due to increasing acuity of care needs Steven Miller, Rm 436 low blood pressure increasing FiO2 requirement lethargy 81 DETERIORATION ALERT
  • 84. Trust | Prediction Performance • Expectation that the predictive system has a sufficiently high performance e.g., precision, recall, AUC etc. [Lipton 2016, Hill 2018] • Explanations accompanied with sub-par predictions can foster distrust • The model should perform sufficiently well on the prediction task in its intended use • Trauma patients: vital signs and lab criteria fulfill criteria to trigger an alarm, leading to increasing numbers of false positives [Nguyen 2014]
  • 85. Trust | Human Performance Parity • The model has at least parity with the performance of human practitioners • The model makes the same types of mistakes that a human makes • Example: Model B has human performance parity Model Precision Recall F-Score Accuracy Physician’s Prediction 0.73 0.71 0.72 0.60 Model A 0.82 0.68 0.74 0.65 Model B 0.83 0.81 0.82 0.89 Results for Length of Stay Prediction (Long vs. Short Stays)
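Checking human-performance parity amounts to computing the same metrics for the model and for clinician predictions on a shared, held-out set; a minimal sketch with scikit-learn metrics (the labels below are synthetic, not the results in the table above).

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

rng = np.random.default_rng(5)
y_true = rng.integers(0, 2, size=200)                       # long vs. short stay
physician_pred = np.where(rng.random(200) < 0.70, y_true, 1 - y_true)
model_pred = np.where(rng.random(200) < 0.85, y_true, 1 - y_true)

# Same metrics, same test set, for both the clinician and the model.
for name, pred in [("Physician", physician_pred), ("Model", model_pred)]:
    print(name,
          f"precision={precision_score(y_true, pred):.2f}",
          f"recall={recall_score(y_true, pred):.2f}",
          f"f1={f1_score(y_true, pred):.2f}",
          f"accuracy={accuracy_score(y_true, pred):.2f}")
```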
  • 86. Trust | Counterpoint to Human-Parity • Some models may not need parity if they are not relied on for the cases which humans are already good at predicting • The predictive system could be a hybrid of human and machine decision making Model Precision Recall F-Score Accuracy Physician’s Prediction 0.73 0.71 0.72 0.60 Model A 0.24 0.68 0.74 0.65 Model B 0.83 0.81 0.82 0.89 Results for Length of Stay Prediction (Long vs. Short Stays)
  • 87. Trust | Performance vs. Explainability Image Source: Easy Solutions
  • 88. Trust | Performance, Explanation and Risk Trade-off • The trade-off is between performance, explanation, and risk • High-risk problems like predicting end of life may need explanations because the cost of a false positive may be very high • Low-risk problems like cost prediction may not need explanations, so optimization is centered on prediction performance
  • 89. Trust | Explaining Deep Learning [Craven 1996]
  • 90. Trust | Explanations vs. Associations • A CNN is trained to recognize objects in images • A language generating RNN is trained to translate features of the CNN into words and captions
  • 91. Trust | Explanations vs. Associations • Hendricks et al created a system to generate explanations of bird classifications. The system learns to: • Classify bird species with 85% accuracy • Associate image descriptions (discriminative features of the image) with class definitions (image-independent discriminative features of the class) • Limitations • Limited (indirect at best) explanation of internal logic • Limited utility for understanding classification errors
  • 92. Trust | Explainability and Adversarial ML [Szegedy et al., 2013, Ghorbani 2017]
  • 93. Fidelity | Risk of Readmission Fidelity • The expectation that the explanation and the predictive model align well with one another Risk of Readmission • Predict if the patient will be readmitted within a particular span of time, e.g., 30 days from time of discharge
  • 94. Fidelity • Dr. Warren examines the explanations for risk of readmission and discovers that the explanations do not appear to be correct • After ruling out non-determinism and lack of consistency as an explanation, Katherine examines the data • They discover there is a problem with the data leading to incorrect explanations • Fixing the data does not solve the problem of incorrect explanations completely • Katherine discovers that the explanation models and the prediction models are different
  • 95. Fidelity | Data Provenance • The explanation is only going to be as good as the data • Pneumonia Mortality Risk Prediction Example • Low quality data • Instrumentation problems • Censoring • Incorrect explanations may be given because of problems in the data • Constraints on data collection may also show up as constraints on the explanations • Shifts in distributions of underlying data must be recognized
  • 96. Fidelity | Models • Fidelity with the underlying phenomenon • Readmission model which uses lunar cycles as a feature: The model may have good predictive power and the feature may even be quite helpful but the model does not have fidelity with the underlying phenomenon
  • 97. Fidelity | Explanations • An explanation is Sound if it adheres to how the model actually works • An explanation is Complete if it encompasses the complete extent of the model • Soundness and Completeness are relative • The soundness of an explanation can also vary depending upon the constraints on the explanations • Ante-Hoc models are perfectly sound by definition; Post-Hoc models can have varying levels of soundness
  • 98. Fidelity | Explanations • Fidelity with the prediction model i.e., explanations should align as closely to the predictive model as possible (Soundness) • Ante-Hoc Models: Models where the predictive model and the explanation model are the same • Post-Hoc Models: Models where the predictive model and the explanation model are different • Special Case: Mimic Models
  • 99. Fidelity | Mimic Models • Also known as Shadow, Surrogate or Student Models • Use the output (instead of the true labels) from the complex model and the training data to train a model which is explainable [Bucilua 2006, Tan 2017] • The performance of the student model is usually quite good • Example: Given a highly accurate SVM, train a decision tree on the predicted label of the SVM and the original data
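A minimal mimic-model sketch: a gradient boosting classifier stands in for the complex black box, and a shallow decision tree is trained on the black box's predicted labels rather than the true labels, yielding rules that approximate its behavior. Data, model choices, and tree depth are illustrative.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(6)
X = rng.normal(size=(800, 4))
y = ((X[:, 0] + X[:, 1] ** 2 - X[:, 2]) > 0).astype(int)

# Complex 'teacher' model (the black box).
black_box = GradientBoostingClassifier(random_state=0).fit(X, y)
teacher_labels = black_box.predict(X)

# Explainable 'student' (mimic) model trained on the teacher's outputs.
mimic = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, teacher_labels)

print("agreement with the black box:", (mimic.predict(X) == teacher_labels).mean())
print(export_text(mimic, feature_names=["f0", "f1", "f2", "f3"]))
```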
  • 100. Fidelity | Mimic Models for Deep Learning
  • 101. Fidelity | Imputation and Explanations • Electronic health record data is often sparse • Imputation is often used to complete the necessary data fields • How should imputed fields affect model explanations? • Possible concerns with safety & user trust: • If the explanation for a readmission risk score includes the results of a test that a physician has not yet ordered, how will the physician react? • Could it affect patient safety? • If patient has a low readmission risk due to a ”low troponin” yet troponin has not been evaluated, the physician may overlook that fact and assume that an acute coronary event has already been ruled out [Saha et al. 2015]
  • 102. BETA • Black Box Explanation through Transparent Approximations • BETA learns a compact two-level decision set in which each rule explains part of the model behavior unambiguously • Novel objective function so that the learning process is optimized for: • High Fidelity (high agreement between explanation and the model) • Low Unambiguity (little overlap between decision rules in the explanation) • High Interpretability (the explanation decision set is lightweight and small) [Lakkaraju, Bach & Leskovec, 2016]
  • 103. Interpreting Random Forests • For each decision that a tree (or a forest) makes there is a path (or paths) from the root of the tree to the leaf • The path consists of a series of decisions, corresponding to a particular feature, each of which contribute to the final predictions • Variable importance can be computed as the contribution of each node for a particular decision for all the decision trees in the random forest [Saabas 2014]
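A sketch of this path-based decomposition, assuming the treeinterpreter package that implements the approach in [Saabas 2014]: each prediction is split into a bias term (roughly the training-set mean) plus per-feature contributions accumulated along the decision paths. The data is synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from treeinterpreter import treeinterpreter as ti

rng = np.random.default_rng(7)
X = rng.normal(size=(500, 3))
y = 4.0 + 1.5 * X[:, 0] - 0.5 * X[:, 2] + rng.normal(scale=0.2, size=500)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# For every instance: prediction == bias + sum(contributions).
prediction, bias, contributions = ti.predict(rf, X[:1])
print("prediction:", prediction[0])
print("bias (training mean):", bias[0])
print("per-feature contributions:", contributions[0])
```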
  • 104. Audience Quiz: Fidelity Ante-Hoc models are: A. Always Sound B. Mostly Sound C. Always Complete D. Mostly Complete E. Both A & C F. Both A & D G. Both B & C H. Both B & D
  • 105.
  • 106. Conclusion: Explainability in Healthcare AI Artificial Intelligence Explanations Assistive Intelligence + Clinical Expertise
  • 107. References Ahmad 2018 Muhammad Aurangzeb Ahmad, Carly Eckert, Greg McKelvey, Kiyana Zolfagar, Anam Zahid, Ankur Teredesai. Death vs. Data Science: Predicting End of Life IAAI February 2-6, 2018 Al-Shedivat 2017A Al-Shedivat, Maruan, Avinava Dubey, and Eric P. Xing. ”Contextual Explanation Networks.” arXiv preprint arXiv:1705.10301 (2017). Al-Shedivat 2017B Al-Shedivat, Maruan, Avinava Dubey, and Eric P. Xing. "The Intriguing Properties of Model Explanations." Bach 2015 Bach, Sebastian, et al. "On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation." PloS one 10.7 (2015): e0130140 Bayes 1763 Bayes, Thomas, Richard Price, and John Canton. "An essay towards solving a problem in the doctrine of chances." (1763): 370-418. Biran 2014 Biran, Or, and Kathleen McKeown. "Justification narratives for individual classifications." In Proceedings of the AutoML workshop at ICML, vol. 2014. 2014. Biran 2017A Biran, Or, and Kathleen R. McKeown. "Human-Centric Justification of Machine Learning Predictions." In IJCAI, pp. 1461-1467. 2017. Biran 2017B Biran, Or, and Courtenay Cotton. "Explanation and justification in machine learning: A survey." In IJCAI-17 Workshop on Explainable AI (XAI), p. 8. 2017. Bucilua 2006 Buciluǎ, Cristian, Rich Caruana, and Alexandru Niculescu-Mizil. "Model compression." In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 535-541. ACM, 2006. Caruna 2015 Caruana, Rich, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, and Noemie Elhadad. "Intelligible models for healthcare: Predicting pneumonia risk and hospital 30-day readmission." In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1721-1730. ACM, 2015. Caruna 2017 Caruana, Rich, Sarah Tan, Yin Lou, Johannes Gehrke, Paul Koch, Marc Sturm, Noemie Elhadad et al. "Interactive Machine Learning via Transparent Modeling: Putting Human Experts in the Driver’s Seat.” IDEA 2017 Chang 2009 Chang, Jonathan, et al. “Reading tea leaves: How humans interpret topic models.” Advances in Neural Information Processing Systems. 2009 Craik 1967 Craik, Kenneth James Williams. The nature of explanation. Vol. 445. CUP Archive, 1967. Craven 1996 Mark W. Craven and Jude W. Shavlik. Extracting tree-structured representations of trained networks. Advances in Neural Information Processing Systems, 1996. Datta 2016 Datta, Anupam, Shayak Sen, and Yair Zick. "Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems." Security and Privacy (SP), 2016 IEEE Symposium on. IEEE, 2016 Doshi-Velez 2014 Doshi-Velez, Finale, and Been Kim. "Towards a rigorous science of interpretable machine learning." arXiv preprint arXiv:1702.08608 (2017). Doshi-Velez 2017 Doshi-Velez, Finale, Mason Kortz, Ryan Budish, Chris Bavitz, Sam Gershman, David O'Brien, Stuart Schieber, James Waldo, David Weinberger, and Alexandra Wood. "Accountability of AI under the law: The role of explanation." arXiv preprint arXiv:1711.01134 (2017). Druzdzel 1996 Druzdzel, Marek J. "Qualitiative verbal explanations in bayesian belief networks." AISB QUARTERLY (1996): 43-54. Farah 2014 Farah, M.J. Brain images, babies, and bathwater: Critiquing critiques of functional neuroimaging. Interpreting Neuroimages: An Introduction to the Technology and Its Limits 45, S19-S30 (2014) Freitas 2014 Freitas, Alex A. "Comprehensible classification models: a position paper." ACM SIGKDD explorations newsletter 15, no. 
1 (2014): 1-10.
  • 108. References Friedman 2001 Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Vol. 1. New York: Springer series in statistics, 2001. Friedman 2008 Friedman, Jerome H., and Bogdan E. Popescu. "Predictive learning via rule ensembles." The Annals of Applied Statistics2, no. 3 (2008): 916-954. Goldstein 2014 Goldstein, Alex, Adam Kapelner, Justin Bleich, and Emil Pitkin. "Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation." Journal of Computational and Graphical Statistics 24, no. 1 (2015): 44-65. Guidotti 2018 Guidotti, Riccardo, Anna Monreale, Franco Turini, Dino Pedreschi, and Fosca Giannotti. "A survey of methods for explaining black box models." arXiv preprint arXiv:1802.01933(2018). Gunning 2016 David Gunning Explainable Artificial Intelligence (XAI) DARPA/I2O 2016 Gorbunov 2011 Gorbunov, K. Yu, and Vassily A. Lyubetsky. "The tree nearest on average to a given set of trees." Problems of Information Transmission 47, no. 3 (2011): 274. Hall 2018 Hall, P., Gill, N., Kurka, M., Phan, W. (May 2018). Machine Learning Interpretability with H2O Driverless AI. http://docs.h2o.ai. Hardt 2016 Hardt, Moritz, Eric Price, and Nati Srebro. "Equality of opportunity in supervised learning." In Advances in neural information processing systems, pp. 3315-3323. 2016. Hastie 1990 T. Hastie and R. Tibshirani. Generalized additive models . Chapman and Hall/CRC, 1990. Herlocker 2000 Herlocker, Jonathan L., Joseph A. Konstan, and John Riedl. "Explaining collaborative filtering recommendations." In Proceedings of the 2000 ACM conference on Computer supported cooperative work, pp. 241-250. ACM, 2000. Hoffman 2017 Hoffman, Robert R., Shane T. Mueller, and Gary Klein. "Explaining Explanation, Part 2: Empirical Foundations." IEEE Intelligent Systems 32, no. 4 (2017): 78-86. Hutson 2018 Hutson, Matthew. "Has artificial intelligence become alchemy?." (2018): 478-478. Koh 2017 Koh, Pang Wei, and Percy Liang. "Understanding black-box predictions via influence functions." arXiv preprint arXiv:1703.04730 (2017).Harvard Kulesza 2014 Kulesza, Todd. "Personalizing machine learning systems with explanatory debugging." (2014). Lakkaraju 2016 Himabindu Lakkaraju, Stephen H. Bach, and Jure Leskovec. “Interpretable decision sets: A joint framework for description and prediction.” Proc. 22nd ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining. ACM, 2016. Lei 2017 Jing Lei, Max G’Sell, Alessandro Rinaldo, Ryan J. Tibshirani, and Larry Wasserman. Distribution-free predictive inference for regression. Journal of the American Statistical Association, 2017 Lipovetsky 2001 Lipovetsky, Stan, and Michael Conklin. "Analysis of regression in game theory approach." Applied Stochastic Models in Business and Industry 17.4 (2001): 319-330 Lipton 2016 Lipton, Zachary C. ”The mythos of model interpretability.” arXiv preprint arXiv:1606.03490 (2016). Lipton 2017 Lipton, Zachary C. "The Doctor Just Won't Accept That!." arXiv preprint arXiv:1711.08037 (2017). Lou 2013 Lou, Yin, Rich Caruana, Johannes Gehrke, and Giles Hooker. ”Accurate intelligible models with pairwise interactions.” In Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 623-631. ACM, 2013. Lundberg 2017 Lundberg, Scott M., and Su-In Lee. "A unified approach to interpreting model predictions." In Advances in Neural Information Processing Systems, pp. 4765-4774. 2017. Miller 2017A Miller, Tim. 
”Explanation in artificial intelligence: Insights from the social sciences.” arXiv preprint arXiv:1706.07269 (2017). Miller 2017B Miller, Tim, Piers Howe, and Liz Sonenberg. "Explainable AI: Beware of inmates running the asylum." In IJCAI-17 Workshop on Explainable AI (XAI), vol. 36. 2017. Montavon 2017 Montavon, Grégoire, Sebastian Lapuschkin, Alexander Binder, Wojciech Samek, and Klaus-Robert Müller. "Explaining nonlinear classification decisions with deep taylor decomposition." Pattern Recognition 65 (2017): 211-222.
  • 109. References
[Moosavi-Dezfooli 2016] Moosavi-Dezfooli, Seyed-Mohsen, Alhussein Fawzi, and Pascal Frossard. "DeepFool: A simple and accurate method to fool deep neural networks." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2574-2582. 2016.
[Morstatter 2016] Morstatter, Fred, and Huan Liu. "Measuring topic interpretability with crowdsourcing." KDnuggets, November 2016.
[Nott 2017] Nott, George. "Google's research chief questions value of 'Explainable AI'." Computerworld, June 23, 2017.
[Olah 2018] Olah, Chris, Arvind Satyanarayan, Ian Johnson, Shan Carter, Ludwig Schubert, Katherine Ye, and Alexander Mordvintsev. "The building blocks of interpretability." Distill 3, no. 3 (2018): e10.
[Perez 2004] Pérez, Jesus Maria, Javier Muguerza, Olatz Arbelaitz, and Ibai Gurrutxaga. "A new algorithm to build consolidated trees: Study of the error rate and steadiness." In Intelligent Information Processing and Web Mining, pp. 79-88. Springer, Berlin, Heidelberg, 2004.
[Quinlan 1999] Quinlan, J. Ross. "Some elements of machine learning." In International Conference on Inductive Logic Programming, pp. 15-18. Springer, Berlin, Heidelberg, 1999.
[Ras 2018] Ras, Gabrielle, Pim Haselager, and Marcel van Gerven. "Explanation methods in deep learning: Users, values, concerns and challenges." arXiv preprint arXiv:1803.07517 (2018).
[Ribeiro 2016a] Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. "Why should I trust you?: Explaining the predictions of any classifier." In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135-1144. ACM, 2016.
[Ribeiro 2016b] Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. "Introduction to Local Interpretable Model-Agnostic Explanations (LIME)." O'Reilly Media, August 12, 2016.
[Ross 2018] Ross, Casey. "IBM's Watson supercomputer recommended 'unsafe and incorrect' cancer treatments, internal documents show." STAT News, July 25, 2018.
[Ross 2017] Ross, Andrew Slavin, Michael C. Hughes, and Finale Doshi-Velez. "Right for the right reasons: Training differentiable models by constraining their explanations." arXiv preprint arXiv:1703.03717 (2017).
[Saabas 2014] Saabas, Ando. "Interpreting random forests." Data Dive blog, 2014.
[Shrikumar 2017] Shrikumar, Avanti, Peyton Greenside, and Anshul Kundaje. "Learning important features through propagating activation differences." arXiv preprint arXiv:1704.02685 (2017).
[Strumbelj 2014] Strumbelj, Erik, and Igor Kononenko. "Explaining prediction models and individual predictions with feature contributions." Knowledge and Information Systems 41, no. 3 (2014): 647-665.
[Tan 2017] Tan, Sarah, Rich Caruana, Giles Hooker, and Yin Lou. "Detecting bias in black-box models using transparent model distillation." arXiv preprint arXiv:1710.06169 (2017).
[Turing 1950] Turing, A. M. "Computing machinery and intelligence." Mind 59, no. 236 (1950): 433-460.
[Ustun 2016] Ustun, Berk, and Cynthia Rudin. "Supersparse linear integer models for optimized medical scoring systems." Machine Learning 102, no. 3 (2016): 349-391.
[Wang 2015] Wang, Fulton, and Cynthia Rudin. "Falling rule lists." In Artificial Intelligence and Statistics, pp. 1013-1022. 2015.
[Weld 2018] Weld, Daniel S., and Gagan Bansal. "Intelligible artificial intelligence." arXiv preprint arXiv:1803.04263 (2018).
[Weller 2017] Weller, Adrian. "Challenges for transparency." arXiv preprint arXiv:1708.01870 (2017).
[Wick 1992] Wick, M. R., and W. B. Thompson. "Reconstructive expert system explanation." Artificial Intelligence 54, no. 1-2 (1992): 33-70.
[Yang 2016] Yang, Hongyu, Cynthia Rudin, and Margo Seltzer. "Scalable Bayesian rule lists." arXiv preprint arXiv:1602.08610 (2016).

Editor's Notes

  1. http://bmjopen.bmj.com/content/5/8/e008657.full
  2. Emergency departments evaluate and manage hundreds of patients each day, and busy EDs are the norm around the world. The pressure is on ED physicians and administrators to quickly make decisions regarding patient treatment and disposition; ML algorithms can assist providers in making these decisions.
  3. Visual of EHR-based patient care: a 62% risk score and its associated factors.
  4. Relevant explanations can inform proper staffing and proper stocking of supplies.
  5. Low importance combined with a high effect means that the feature value is surprisingly high. The user may decide that this makes the prediction dubious (especially if the result would be different were this feature within the normal range), or, on the contrary, that this feature represents rare but solid causal evidence. Note that contrarian evidence (and contrarian counter-evidence) is only possible for features that can take negative values. A small sketch of the importance-versus-effect distinction follows these notes.
  6. Although we can extract explanations from deep learning systems, the explanations may not be correct, because it is possible to fool DL systems that are otherwise highly accurate; a minimal sketch of such fooling follows these notes.
  7. The student model uses the same set of features as the original model. In practice, it is the student model that is deployed; see the distillation sketch after these notes.
  8. High fidelity = the student model's predictions closely agree with the predictions of the original (teacher) model.
  9. High fidelity = a large fraction of cases on which the student model reproduces the original model's prediction.
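
The importance-versus-effect distinction in note 5 can be made concrete with a small sketch. This is not the tutorial's own code: the synthetic data, the hypothetical feature names, and the use of a plain linear model in place of contribution-based explainers such as [Lundberg 2017] or [Strumbelj 2014] are all illustrative assumptions.

```python
# Minimal sketch (synthetic data, hypothetical feature names) contrasting
# global feature importance with per-patient effect: a feature that matters
# little on average can still dominate one prediction when its value is far
# outside the normal range, which is the situation described in note 5.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))            # columns: age, creatinine, rare_lab
y = (0.8 * X[:, 0] + 0.8 * X[:, 1] + 0.1 * X[:, 2]
     + rng.normal(scale=0.5, size=500) > 0).astype(int)
model = LogisticRegression().fit(X, y)

# Per-prediction effect of each feature (a linear-model analogue of a
# contribution value): coefficient times deviation from the training mean.
effects = model.coef_[0] * (X - X.mean(axis=0))
global_importance = np.abs(effects).mean(axis=0)   # small for rare_lab

# A patient whose rare_lab value is surprisingly high: low global importance
# but a large local effect, so the user may treat the prediction with caution
# or read it as rare but strong evidence.
patient = np.array([[0.0, 0.0, 6.0]])
patient_effect = model.coef_[0] * (patient - X.mean(axis=0))
print("global importance per feature:", global_importance)
print("effect for this patient:      ", patient_effect[0])
```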
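Note 6 concerns fooling otherwise accurate models ([Moosavi-Dezfooli 2016]). The sketch below is only a hedged illustration of that idea on a linear classifier with synthetic data, not a reproduction of DeepFool on a deep network; for a linear binary classifier the minimal perturbation has the closed form used here.

```python
# Minimal sketch (synthetic data) of why explanations extracted from a model
# can be fragile: a tiny, targeted perturbation flips the prediction of an
# otherwise accurate classifier. For a linear model the smallest such
# perturbation moves the point straight across the decision boundary.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 20))
w_true = rng.normal(size=20)
y = (X @ w_true > 0).astype(int)

model = LogisticRegression().fit(X, y)
print("training accuracy:", model.score(X, y))     # close to 1.0

x = X[:1]                                          # one example to attack
f = model.decision_function(x)[0]
w = model.coef_[0]

# Minimal perturbation for a linear classifier: step along -f * w / ||w||^2,
# with a small overshoot so the point ends up just past the boundary.
r = -(f / np.dot(w, w)) * w
x_adv = x + 1.05 * r

print("original prediction: ", model.predict(x)[0])
print("perturbation norm:   ", np.linalg.norm(r))
print("perturbed prediction:", model.predict(x_adv)[0])
```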
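Notes 7-9 concern the student (mimic) model and fidelity. The following sketch, with synthetic data and a shallow decision tree chosen purely for illustration, shows one common way to train a student on a black-box teacher's predictions over the same features and to measure fidelity as student-teacher agreement; it is an assumed setup, not the deck's own pipeline.

```python
# Minimal sketch of model distillation: an interpretable "student" is trained
# to mimic a black-box "teacher" using the same features, and fidelity is the
# fraction of held-out cases on which the two models agree.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

teacher = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# The student is fit to the teacher's predicted labels, not the true labels,
# and uses the same set of features as the original model (note 7).
student = DecisionTreeClassifier(max_depth=4, random_state=0)
student.fit(X_train, teacher.predict(X_train))

fidelity = accuracy_score(teacher.predict(X_test), student.predict(X_test))
accuracy = accuracy_score(y_test, student.predict(X_test))
print(f"fidelity to teacher: {fidelity:.3f}")
print(f"accuracy on true labels: {accuracy:.3f}")
```

High fidelity (notes 8 and 9) corresponds to the first printed number being close to 1.0, regardless of how well the student does against the true labels.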