1. Cost of Medical Care
in Older Adults with Chronic Conditions
Randal S. Goomer, PhD
Data Science
DS-SF-29
General Assembly
San Francisco, 1/25/2017
2. Dataset
Medicare 2010 Patient records
Expunged of all personal or identifiable information (33M patient profiles)
Provided by CMS as: 2010 Chronic Conditions Public Use File (PUF) (2010 CMS CC PUF)
Patient Age
Categories: 1 – 6
1. 62-64
2. 65-69
3. 70-74
4. 75-79
5. 80-84
6. 85+
Therapy Behavior:
• Number of Out-Patient Visits
• Number of In-Patient admits
• Medicare Part A, B, C, D, E payments
Chronic Conditions (CC)
• Alzheimer's
• Cancer
• CHF
• Diabetes
• Chronic Kidney Disease
• Stroke
• Osteoporosis
• Depression
• COPD
• Ischemic Heart Condition
• Arthritis
Patient Gender
Cost (Payouts by Medicare)
3. Dataset Includes a 4-page Dictionary of Terms
(Sample: page 1 of 4)
4. Can we predict costs based on patient profile or behavior?
BI Questions
• Does the type of chronic condition (CC) impact costs?
• Does age or gender impact cost?
• Does patient behavior, such as accessing OP facilities vs. IP admits, influence costs?
• Which behavior costs more or less?
• Which CC costs more or less?
H0: <<we cannot predict cost from patient profile and behavior>>
H1: <<Patient profile and behavior can predict cost>>
5. ML models
• Data munging (pandas, NumPy)
• Random Forest
• RF optimized by bootstrapping (with replacement) and OOB error testing against ROC/AUC
• OLS (p-values, R-squared, coefficients, prediction accuracy)
• Logit regression (coef_, curve fitting, predicted probabilities)
• K-means classification (with K-fold grid-search CV optimization)
• Visualizations (Seaborn, Matplotlib)
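As a rough illustration of the OLS item in the list above, the sketch below regresses payout on in-patient admits, out-patient visits, and a CHF flag. The data are synthetic, the column names (`ip_admit`, `op_visits`, `cc_chf`, `payout`) only echo labels used in this deck, and NumPy's least-squares solver is an assumption standing in for whatever OLS routine was actually used.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in for the munged PUF table (NOT real CMS data).
df = pd.DataFrame({
    "ip_admit": rng.poisson(1.0, n),
    "op_visits": rng.poisson(5.0, n),
    "cc_chf": rng.integers(0, 2, n),
})
# Assumed cost structure, chosen only so the fit has something to recover.
df["payout"] = (
    9000 * df["ip_admit"] + 300 * df["op_visits"] + 2000 * df["cc_chf"]
    + rng.normal(0, 1500, n)
)

features = ["ip_admit", "op_visits", "cc_chf"]
# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones(n), df[features].to_numpy(dtype=float)])
y = df["payout"].to_numpy()

# Ordinary least squares: minimise ||X @ beta - y||^2.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# R-squared: share of payout variance explained by the fit.
residuals = y - X @ beta
r_squared = 1.0 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()

coefficients = dict(zip(["intercept"] + features, beta))
print(coefficients, round(r_squared, 3))
```

A library such as statsmodels would additionally report p-values for each coefficient; the NumPy version is kept here only to stay dependency-light.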
6. Heatmap:
Order of importance
w.r.t. ‘payout’ or Cost:
- ip_admit = (hospitalization)
- op_visits = (offices/op-clinics)
- CC_CHF = (Chronic Heart Failure)
- CC_CANCER = (Cancer patients)
- CC_ISCHMCHT = (Ischemic Heart Dis)
- CC_CHRNKIDN = (Chronic kidney Dis)
After detailed data munging, a heatmap was produced.
The heatmap finds hidden correlations.
(Figure: correlation heatmap; highlighted labels: Chronic Kidney Dis., Osteoporosis)
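The ranking on this slide can be approximated by sorting features on their correlation with payout. The sketch below is a minimal example on synthetic data; the effect sizes are invented for illustration, and only the column names are borrowed from the slide.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 1000

# Synthetic stand-in for the munged table (NOT real CMS data).
df = pd.DataFrame({
    "ip_admit": rng.poisson(1.0, n),
    "op_visits": rng.poisson(5.0, n),
    "CC_CHF": rng.integers(0, 2, n),
    "CC_CANCER": rng.integers(0, 2, n),
})
# Assumed cost structure, for illustration only.
df["payout"] = (
    9000 * df["ip_admit"] + 300 * df["op_visits"]
    + 1500 * df["CC_CHF"] + rng.normal(0, 2000, n)
)

# Pairwise Pearson correlations; this matrix is what a heatmap displays.
corr = df.corr()

# Rank features by absolute correlation with payout.
ranking = corr["payout"].drop("payout").abs().sort_values(ascending=False)
print(ranking)
```

Passing `corr` to `seaborn.heatmap(corr, annot=True)` would draw the matrix itself.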
7. Cost v. In-Patient Admit
EDA: Costs rise quickly with the number of in-patient admits
(Figure: payout plotted against number of in-patient admits)
8. Cost v. Out-Patient Visit
EDA: Costs plateau quickly with the number of out-patient visits
(Figure: payout plotted against number of out-patient visits)
9. Patients with Chronic Heart Failure (CHF) by Age v. Cost
Patient age and chronic condition both contribute to cost
(Figure: payout plotted against age, 62 yrs to 85+ yrs, in two panels: CHF = True and CHF = False)
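A comparison like the one on this slide can be tabulated by averaging payout per age category and CHF status. This is a minimal sketch on synthetic data; `age_cat`, `cc_chf`, and `payout` are hypothetical column names, and the effect sizes are invented.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 2000

# Synthetic stand-in (NOT real CMS data); age_cat uses the deck's 1-6 codes.
df = pd.DataFrame({
    "age_cat": rng.integers(1, 7, n),
    "cc_chf": rng.integers(0, 2, n),  # 1 = has CHF, 0 = does not
})
# Assumed effect sizes, for illustration only.
df["payout"] = (
    3000 + 500 * df["age_cat"] + 4000 * df["cc_chf"]
    + rng.normal(0, 1000, n)
)

# Mean payout per age category, split by CHF status
# (rows: age_cat 1..6, columns: cc_chf 0/1).
mean_payout = df.pivot_table(
    index="age_cat", columns="cc_chf", values="payout", aggfunc="mean"
)
print(mean_payout.round(0))
```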
10. Age v. IP Admits and OP Visits
(Figure: two panels, in-patient admits and out-patient visits, each plotted against age, 62 yrs to 85+ yrs)
23. Random Forest: tuned n_estimators from 30 to 2,000 trees using bootstrap samples, with AUC as output; optimized at 1,000 trees
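One way such a sweep could look with scikit-learn is sketched below, scoring each forest by AUC on its out-of-bag (OOB) predictions. The data are synthetic, the tree counts are cut down so the example runs quickly, and the binary "high-cost" target is an assumption made here because AUC needs class labels; the deck does not say how the target was defined.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)
n = 600

# Synthetic features (NOT real CMS data): admits, visits, one condition flag.
X = np.column_stack([
    rng.poisson(1.0, n),
    rng.poisson(5.0, n),
    rng.integers(0, 2, n),
])
cost = 9000 * X[:, 0] + 300 * X[:, 1] + 2000 * X[:, 2] + rng.normal(0, 3000, n)
# Assumption: label above-median cost as the positive class.
y = (cost > np.median(cost)).astype(int)

oob_auc = {}
for n_trees in (30, 100, 300):  # the deck reports going up to 2,000
    rf = RandomForestClassifier(
        n_estimators=n_trees,
        bootstrap=True,   # sample rows with replacement for each tree
        oob_score=True,   # keep out-of-bag predictions
        random_state=0,
    )
    rf.fit(X, y)
    # Out-of-bag probability of the positive class for every training row.
    oob_prob = rf.oob_decision_function_[:, 1]
    oob_auc[n_trees] = roc_auc_score(y, oob_prob)

print(oob_auc)
```

Because each row is scored only by trees that did not see it during training, the OOB AUC works as a built-in validation estimate without a separate hold-out set.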
24. Random Forest: optimized the max_features option using bootstrap, with AUC as output; ‘auto’, sqrt, log2, 0.9, and 0.2 were tried.
The results show statistics when using all features (‘auto’ and None), the square root of the number of features (100 features == 10 used), 90%, or 20%.
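A comparable max_features sweep might be sketched as follows, again scoring by out-of-bag AUC on synthetic data. `'auto'` is left out because recent scikit-learn releases no longer accept it; `None` (all features) stands in for the all-features case described on this slide. The dataset shape and the tree count are assumptions chosen to keep the example fast.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)
n, n_features = 600, 10

# Synthetic data (NOT real CMS data): only the first two columns carry signal.
X = rng.normal(size=(n, n_features))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, n) > 0).astype(int)

oob_auc = {}
for option in (None, "sqrt", "log2", 0.9, 0.2):
    rf = RandomForestClassifier(
        n_estimators=200,
        max_features=option,  # None = all features; floats = fraction of features
        bootstrap=True,
        oob_score=True,
        random_state=0,
    )
    rf.fit(X, y)
    # Score each setting by AUC on the out-of-bag predictions.
    oob_auc[str(option)] = roc_auc_score(y, rf.oob_decision_function_[:, 1])

best = max(oob_auc, key=oob_auc.get)
print(oob_auc, "best:", best)
```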