Analysis of a Binary Outcome Variable Using the FREQ and the LOGISTIC Procedures
Arthur Li

INTRODUCTION
Diagram: Exposure (X), e.g. smoking, and Outcome (Y), e.g. cancer, with additional exposures X1 (age) and X2 (gender).

CONTINGENCY TABLE

                      BREATHING TEST
SMOKING STATUS    ABNORMAL    NORMAL
CURRENT                131       927
NEVER                   38       741

STUDY DESIGN

                  Outcome (Y)
Exposure (X)      1      0
1                 A      B
0                 C      D

P1 = A / (A + B)
P0 = C / (C + D)

ODDS RATIO

                  Outcome (Y)
Exposure (X)      1      0
1                 A      B
0                 C      D

Odds1 = A / B
Odds0 = C / D
Odds Ratio = Odds1 / Odds0 = AD / BC

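As a quick numerical check of the formulas above, here is a minimal sketch that applies them to the smoking and breathing-test counts from the CONTINGENCY TABLE slide. Python is used here only for the arithmetic (the deck itself works in SAS), and the function names are illustrative, not part of the original material.

```python
def risk(events, non_events):
    """Proportion with the outcome: events / (events + non_events)."""
    return events / (events + non_events)


def odds(events, non_events):
    """Odds of the outcome: events / non_events."""
    return events / non_events


def odds_ratio(a, b, c, d):
    """Odds ratio AD / BC for a 2x2 table.

    a, b: exposed with / without the outcome
    c, d: non-exposed with / without the outcome
    """
    return (a * d) / (b * c)


# Cell counts: current smokers (A, B) and never smokers (C, D)
A, B, C, D = 131, 927, 38, 741

p1 = risk(A, B)          # P1 = A / (A + B)
p0 = risk(C, D)          # P0 = C / (C + D)
or_hat = odds_ratio(A, B, C, D)

print(f"P1 = {p1:.4f}, P0 = {p0:.4f}")
print(f"Odds1 = {odds(A, B):.4f}, Odds0 = {odds(C, D):.4f}")
print(f"Odds ratio = {or_hat:.4f}")
```

The odds ratio computed this way should agree with the value PROC FREQ reports for the same table.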
ODDS RATIO

                  Outcome (Y)
Exposure (X)      1      0
1                 A      B
0                 C      D

The odds ratio ranges from 0 to infinity, with 1 as the neutral value:
OR = 1: No association
OR > 1: Exposed group (X = 1) has higher odds
OR < 1: Non-exposed group (X = 0) has higher odds

PROC FREQ

    data breathTest;
        input test $ 1-8 neversmk $ 10-16 count;
        datalines;
    abnormal current 131
    normal   current 927
    abnormal never   38
    normal   never   741
    ;

                             BREATHING TEST (Y)
SMOKING STATUS (X)    ABNORMAL (1)    NORMAL (0)
CURRENT (1)                131 (A)       927 (B)
NEVER (0)                   38 (C)       741 (D)

PROC FREQ

    proc freq data=breathTest;
        weight count;
        tables neversmk*test;
    run;

WEIGHT count: the data are entered directly as the cell counts of the table.

The FREQ Procedure
Table of neversmk by test

neversmk     test
Frequency |
Percent   |
Row Pct   |
Col Pct   | abnormal | normal   |   Total
----------+----------+----------+
current   |      131 |      927 |    1058
          |     7.13 |    50.46 |   57.59
          |    12.38 |    87.62 |
          |    77.51 |    55.58 |
----------+----------+----------+
never     |       38 |      741 |     779
          |     2.07 |    40.34 |   42.41
          |     4.88 |    95.12 |
          |    22.49 |    44.42 |
----------+----------+----------+
Total            169       1668      1837
                9.20      90.80    100.00

PROC FREQ - RELRISK

    proc freq data=breathTest;
        weight count;
        tables neversmk*test / relrisk;
    run;

                             BREATHING TEST (Y)
                              col1            col2
SMOKING STATUS (X)    ABNORMAL (1)    NORMAL (0)
CURRENT (1)                131 (A)       927 (B)
NEVER (0)                   38 (C)       741 (D)

Estimates of the Relative Risk (Row1/Row2)

Type of Study                 Value     95% Confidence Limits
-------------------------------------------------------------
Case-Control (Odds Ratio)     2.7557    1.8962    4.0047
Cohort (Col1 Risk)            2.5383    1.7904    3.5987
Cohort (Col2 Risk)            0.9211    0.8960    0.9470

Sample Size = 1837

The odds of an abnormal test result are about 2.8 times as high for current smokers as for those who have never smoked (95% CI: 1.9 to 4.0).

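The RELRISK figures can be reproduced by hand. The sketch below assumes the usual large-sample (log-scale) confidence interval for the odds ratio, exp(log OR ± z × sqrt(1/A + 1/B + 1/C + 1/D)), which matches the printed limits for this table. It is a Python cross-check of the arithmetic, not SAS code, and the function names are illustrative.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.959964):
    """Odds ratio and log-scale (Wald-type) 95% confidence limits for a 2x2 table."""
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of log(OR)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


def relative_risk(a, b, c, d):
    """Risk of the column-1 outcome in row 1 divided by the same risk in row 2."""
    return (a / (a + b)) / (c / (c + d))


# Rows: current, never. Columns: abnormal (col1), normal (col2).
A, B, C, D = 131, 927, 38, 741

or_hat, or_low, or_high = odds_ratio_with_ci(A, B, C, D)
col1_risk = relative_risk(A, B, C, D)   # "Cohort (Col1 Risk)"
col2_risk = relative_risk(B, A, D, C)   # "Cohort (Col2 Risk)": same ratio for the normal column

print(f"OR = {or_hat:.4f} ({or_low:.4f}, {or_high:.4f})")
print(f"Col1 risk ratio = {col1_risk:.4f}, Col2 risk ratio = {col2_risk:.4f}")
```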
PROC FREQ - CHISQ

    proc freq data=breathTest;
        weight count;
        tables neversmk*test / relrisk chisq;
    run;

Statistics for Table of neversmk by test

Statistic                      DF    Value      Prob
-----------------------------------------------------
Chi-Square                      1    30.2421    <.0001
Likelihood Ratio Chi-Square     1    32.3820    <.0001
Continuity Adj. Chi-Square      1    29.3505    <.0001
Mantel-Haenszel Chi-Square      1    30.2257    <.0001
Phi Coefficient                       0.1283
Contingency Coefficient               0.1273
Cramer's V                            0.1283

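The first row and the phi coefficient of this output can be checked with the shortcut formula for a 2x2 table, chi-square = n(AD - BC)^2 / (row1 × row2 × col1 × col2). The sketch below is a Python cross-check, not SAS. The p-value uses the identity P(chi-square with 1 df > x) = erfc(sqrt(x / 2)), so it applies to 1 degree of freedom only.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df) for a 2x2 table of counts."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi2_pvalue_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))


A, B, C, D = 131, 927, 38, 741
n = A + B + C + D

chi2 = pearson_chi2_2x2(A, B, C, D)
p_value = chi2_pvalue_df1(chi2)
phi = math.sqrt(chi2 / n)   # magnitude of the phi coefficient for a 2x2 table

print(f"Chi-square = {chi2:.4f}, p = {p_value:.2e}, phi = {phi:.4f}")
```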
LOGISTIC REGRESSION MODEL
Reference cell coding
β: the increment in log odds for current smokers compared to those who never smoked

                      BREATHING TEST
SMOKING STATUS    ABNORMAL    NORMAL
CURRENT                131       927
NEVER                   38       741

LOGISTIC REGRESSION MODEL

    proc logistic data=breathTest;
        class neversmk / param=ref;
        weight count;
        model test = neversmk;
    run;

The LOGISTIC Procedure
Model Information

Data Set                       WORK.BREATHTEST
Response Variable              test
Number of Response Levels      2
Weight Variable                count
Model                          binary logit
Optimization Technique         Fisher's scoring

Number of Observations Read    4
Number of Observations Used    4
Sum of Weights Read            1837
Sum of Weights Used            1837

LOGISTIC REGRESSION MODEL

    proc logistic data=breathTest;
        class neversmk / param=ref;
        weight count;
        model test = neversmk;
    run;

Response Profile

Ordered                 Total        Total
Value      test         Frequency    Weight
1          abnormal     2             169.0000
2          normal       2            1668.0000

Probability modeled is test='abnormal'.

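LOGISTIC REGRESSION MODEL
Each of the following three forms models the probability that test = 'normal' instead of the default test = 'abnormal'.

DESCENDING option in the PROC LOGISTIC statement:

    proc logistic data=breathTest descending;
        class neversmk / param=ref;
        weight count;
        model test = neversmk;
    run;

DESCENDING option for the response variable:

    proc logistic data=breathTest;
        class neversmk / param=ref;
        weight count;
        model test (descending) = neversmk;
    run;

EVENT= option for the response variable:

    proc logistic data=breathTest;
        class neversmk / param=ref;
        weight count;
        model test (event="normal") = neversmk;
    run;
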
LOGISTIC REGRESSION MODEL
Reference Cell Coding

    proc logistic data=breathTest;
        class neversmk / param=ref;
        weight count;
        model test = neversmk;
    run;

Class Level Information

                        Design
Class       Value       Variables
neversmk    current      1
            never        0

LOGISTIC REGRESSION MODEL
Effect Coding

    proc logistic data=breathTest;
        class neversmk;
        weight count;
        model test = neversmk;
    run;

Class Level Information

                        Design
Class       Value       Variables
neversmk    current      1
            never       -1

LOGISTIC REGRESSION MODEL

    proc logistic data=breathTest;
        class neversmk / param=ref;
        weight count;
        model test = neversmk;
    run;

Class Level Information

                        Design
Class       Value       Variables
neversmk    current      1
            never        0

LOGISTIC REGRESSION MODEL

    proc logistic data=breathTest;
        class neversmk (ref="never") / param=ref;
        weight count;
        model test = neversmk;
    run;

Model Fit Statistics

                              Intercept
              Intercept       and
Criterion     Only            Covariates
AIC           1130.417        1100.035
SC            1129.803        1098.808
-2 Log L      1128.417        1096.035

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    32.3820        1    <.0001
Score               30.2421        1    <.0001
Wald                28.2434        1    <.0001

LOGISTIC REGRESSION MODEL

    proc logistic data=breathTest;
        class neversmk (ref="never") / param=ref;
        weight count;
        model test = neversmk;
    run;

Type 3 Analysis of Effects

                    Wald
Effect      DF      Chi-Square    Pr > ChiSq
neversmk     1      28.2434       <.0001

Analysis of Maximum Likelihood Estimates

                                       Standard    Wald
Parameter             DF    Estimate   Error       Chi-Square    Pr > ChiSq
Intercept              1    -2.9704    0.1663      318.9365      <.0001
neversmk  current      1     1.0136    0.1907       28.2434      <.0001

The log odds of an abnormal test are 1.01 higher for current smokers than for people who never smoked.
OR = exp(1.0136) = 2.756

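With a single binary predictor under reference cell coding, the fitted model reproduces the observed cell proportions, so both estimates can be recovered directly from the 2x2 table: the intercept is the log odds in the reference group (never smoked) and the slope is the difference in log odds. The sketch below is a Python cross-check of that arithmetic, not SAS, and the variable names are illustrative.

```python
import math

# Cell counts from the breathing-test table
abnormal_current, normal_current = 131, 927
abnormal_never, normal_never = 38, 741

# Log odds of an abnormal test in each group
log_odds_never = math.log(abnormal_never / normal_never)
log_odds_current = math.log(abnormal_current / normal_current)

intercept = log_odds_never                   # reference group: never smoked
beta = log_odds_current - log_odds_never     # increment for current smokers

# Standard error of beta equals the standard error of the log odds ratio
se_beta = math.sqrt(
    1 / abnormal_current + 1 / normal_current + 1 / abnormal_never + 1 / normal_never
)
wald_chi2 = (beta / se_beta) ** 2

print(f"Intercept = {intercept:.4f}")
print(f"Beta      = {beta:.4f} (SE {se_beta:.4f}, Wald chi-square {wald_chi2:.2f})")
print(f"OR        = {math.exp(beta):.3f}")
```

Small differences in the last printed digit relative to the SAS output are expected, because PROC LOGISTIC obtains its estimates iteratively.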
LOGISTIC REGRESSION MODEL

    proc logistic data=breathTest;
        class neversmk (ref="never") / param=ref;
        weight count;
        model test = neversmk;
    run;

Odds Ratio Estimates

                                 Point       95% Wald
Effect                           Estimate    Confidence Limits
neversmk current vs never        2.756       1.896    4.004

Result from PROC FREQ:

Estimates of the Relative Risk (Row1/Row2)

Type of Study                 Value     95% Confidence Limits
-------------------------------------------------------------
Case-Control (Odds Ratio)     2.7557    1.8962    4.0047
Cohort (Col1 Risk)            2.5383    1.7904    3.5987
Cohort (Col2 Risk)            0.9211    0.8960    0.9470

Sample Size = 1837

LOGISTIC REGRESSION MODEL

    proc logistic data=breathTest;
        class neversmk (ref="never") / param=ref;
        weight count;
        model test = neversmk;
        oddsratio 'smoking' neversmk;
    run;

Syntax (new in 9.2): ODDSRATIO <'label'> variable </ options>;

Wald Confidence Interval for Odds Ratios

Label                                   Estimate    95% Confidence Limits
smoking neversmk current vs never       2.756       1.896    4.004

LOGISTIC REGRESSION MODEL

    proc logistic data=breathTest;
        class neversmk (ref="never") / param=ref;
        weight count;
        model test = neversmk;
        oddsratio 'smoking' neversmk / cl=pl;
    run;

Profile Likelihood Confidence Interval for Odds Ratios

Label                                   Estimate    95% Confidence Limits
smoking neversmk current vs never       2.756       1.916    4.054

CONFOUNDING
Diagram: Smoking, Test, Age.
Omitting Age can lead to either an overestimate or an underestimate of the relationship between Smoking and Test.

CONFOUNDING
Diagram: Smoking, Test, Age.
Adjusting for age, you compare smokers and non-smokers at common values of age.
Figure: log(odds) for smokers and non-smokers by age group (< 40, ≥ 40).

INTERACTION
Age is referred to as an effect modifier.
Figure: log(odds) for smokers and non-smokers by age group (< 40, ≥ 40).

INTERACTION & CONFOUNDING

THE PURPOSES AND STRATEGIES FOR MODEL BUILDING

Is the association between "Smoking" and "Test" different in the 2 age groups?
    Y: There is an interaction. Report age-specific OR.
    N: No interaction. Is "Age" a confounder?
        Y: Report age-adjusted OR.
        N: Report crude OR.

THE PURPOSES AND STRATEGIES FOR MODEL BUILDING

Outcome    Main Var    Covariate    OR     P        Include?
Y          X                        2.3    <0.05
Y          X           Z            4.2    0.2      YES
Y          X           Z            2.4    0.01     MAYBE

PROC FREQ: INTERACTION EFFECT

    data breathTestAge;
        input test $ 1-8 neversmk $ 10-16 over40 $ 18-20 count;
        datalines;
    normal   never   no  577
    abnormal never   no  34
    normal   current no  682
    abnormal current no  57
    normal   never   yes 164
    abnormal never   yes 4
    normal   current yes 245
    abnormal current yes 74
    ;

PROC FREQ: INTERACTION EFFECT

    proc freq data=breathTestAge;
        weight count;
        tables over40*neversmk*test / chisq relrisk cmh;
    run;

The CMH option:

Breslow-Day Test for Homogeneity of the Odds Ratios
------------------------------
Chi-Square        18.0829
DF                      1
Pr > ChiSq         <.0001

Total Sample Size = 1837

The association between smoking status and the breathing test is not the same across age groups.

PROC FREQ: INTERACTION EFFECT

    proc freq data=breathTestAge;
        weight count;
        tables over40*neversmk*test / chisq relrisk cmh;
    run;

Statistics for Table 1 of neversmk by test
Controlling for over40=no

Statistic                      DF    Value      Prob
-----------------------------------------------------
Chi-Square                      1    2.4559     0.1171
Likelihood Ratio Chi-Square     1    2.4893     0.1146
Continuity Adj. Chi-Square      1    2.1260     0.1448
Mantel-Haenszel Chi-Square      1    2.4541     0.1172
Phi Coefficient                      0.0427
Contingency Coefficient              0.0426
Cramer's V                           0.0427

Estimates of the Relative Risk (Row1/Row2)

Type of Study                 Value     95% Confidence Limits
-------------------------------------------------------------
Case-Control (Odds Ratio)     1.4184    0.9144    2.2000
Cohort (Col1 Risk)            1.3861    0.9190    2.0906
Cohort (Col2 Risk)            0.9772    0.9499    1.0054

Sample Size = 1350

Statistics for Table 2 of neversmk by test
Controlling for over40=yes

Statistic                      DF    Value      Prob
-----------------------------------------------------
Chi-Square                      1    35.4510    <.0001
Likelihood Ratio Chi-Square     1    45.1246    <.0001
Continuity Adj. Chi-Square      1    33.9203    <.0001
Mantel-Haenszel Chi-Square      1    35.3782    <.0001
Phi Coefficient                      0.2698
Contingency Coefficient              0.2605
Cramer's V                           0.2698

Estimates of the Relative Risk (Row1/Row2)

Type of Study                 Value      95% Confidence Limits
--------------------------------------------------------------
Case-Control (Odds Ratio)     12.3837    4.4416    34.5272
Cohort (Col1 Risk)             9.7429    3.6253    26.1844
Cohort (Col2 Risk)             0.7868    0.7374     0.8394

PROC FREQ: INTERACTION EFFECT

    proc freq data=breathTestAge;
        weight count;
        tables over40*neversmk*test / chisq relrisk cmh;
    run;

Summary Statistics for neversmk by test
Controlling for over40

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis     DF    Value      Prob
--------------------------------------------------------------
1            Nonzero Correlation         1    25.2444    <.0001
2            Row Mean Scores Differ      1    25.2444    <.0001
3            General Association         1    25.2444    <.0001

Estimates of the Common Relative Risk (Row1/Row2)

Type of Study     Method             Value     95% Confidence Limits
--------------------------------------------------------------------
Case-Control      Mantel-Haenszel    2.5683    1.7618    3.7441
(Odds Ratio)      Logit              1.9840    1.3252    2.9702
Cohort            Mantel-Haenszel    2.4174    1.6754    3.4879
(Col1 Risk)       Logit              1.8475    1.2641    2.7001
Cohort            Mantel-Haenszel    0.9289    0.9046    0.9538
(Col2 Risk)       Logit              0.9437    0.9195    0.9686

These statistics and the adjusted OR are useful only if the OR is homogeneous across the categories of the adjusting variable.

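To see where the stratum-specific and Mantel-Haenszel odds ratios come from, the sketch below recomputes them from the cell counts of the breathTestAge data. It is a Python cross-check, not SAS. It uses the standard Mantel-Haenszel estimator, the sum over strata of AD/n divided by the sum of BC/n, and the function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio AD / BC for one 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# (current abnormal, current normal, never abnormal, never normal) per age stratum
strata = {
    "over40=no": (57, 682, 34, 577),
    "over40=yes": (74, 245, 4, 164),
}

stratum_or = {label: odds_ratio(*cells) for label, cells in strata.items()}
mh_or = mantel_haenszel_or(list(strata.values()))

for label, value in stratum_or.items():
    print(f"{label}: OR = {value:.4f}")
print(f"Mantel-Haenszel common OR = {mh_or:.4f}")
```

The two stratum-specific odds ratios are far apart, which is why a single pooled value is hard to interpret for these data.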
PROC LOGISTIC: INTERACTION EFFECT

    proc logistic data=breathTestAge;
        class neversmk (ref="never") over40 (ref="no") / param=ref;
        weight count;
        model test = neversmk over40 neversmk*over40;
    run;

PROC LOGISTIC: INTERACTION EFFECT

    proc logistic data=breathTestAge;
        class neversmk (ref="never") over40 (ref="no") / param=ref;
        weight count;
        model test = neversmk over40 neversmk*over40;
    run;

Wald Test:

Analysis of Maximum Likelihood Estimates

                                                 Standard    Wald
Parameter                        DF    Estimate  Error       Chi-Square    Pr > ChiSq
Intercept                         1    -2.8315   0.1765      257.4193      <.0001
neversmk         current          1     0.3495   0.2240        2.4355      0.1186
over40           yes              1    -0.8820   0.5359        2.7086      0.0998
neversmk*over40  current yes      1     2.1668   0.5691       14.4985      0.0001

PROC LOGISTIC: INTERACTION EFFECT

    proc logistic data=breathTestAge;
        class neversmk (ref="never") over40 (ref="no") / param=ref;
        weight count;
        model test = neversmk over40 neversmk*over40;
    run;

Likelihood Ratio Test:

PROC LOGISTIC: INTERACTION EFFECT

    proc logistic data=breathTestAge;
        class neversmk (ref="never") over40 (ref="no") / param=ref;
        weight count;
        model test = neversmk over40 neversmk*over40;
    run;

Model Fit Statistics

                              Intercept
              Intercept       and
Criterion     Only            Covariates
AIC           1130.417        1055.467
SC            1130.497        1055.785
-2 Log L      1128.417        1047.467

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    80.9500        3    <.0001
Score               95.7956        3    <.0001
Wald                81.3305        3    <.0001

PROC LOGISTIC: INTERACTION EFFECT

    proc logistic data=breathTestAge;
        class neversmk (ref="never") over40 (ref="no") / param=ref;
        weight count;
        model test = neversmk over40;
    run;

Model Fit Statistics

                              Intercept
              Intercept       and
Criterion     Only            Covariates
AIC           1130.417        1074.123
SC            1130.497        1074.361
-2 Log L      1128.417        1068.123

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    60.2942        2    <.0001
Score               61.2515        2    <.0001
Wald                56.4737        2    <.0001

PROC LOGISTIC: INTERACTION EFFECT

    /* full model: save the fit statistics and the global tests */
    proc logistic data=breathTestAge;
        class neversmk (ref="never") over40 (ref="no") / param=ref;
        weight count;
        model test = neversmk over40 neversmk*over40;
        ods output FitStatistics=log2Ratio_full GlobalTests=df_full;
    run;

    /* store -2 Log L of the full model in a macro variable */
    data _null_;
        set log2Ratio_full;
        if Criterion = '-2 Log L';
        call symput('neg2L_full', InterceptAndCovariates);
    run;

    /* store the degrees of freedom of the full model */
    data _null_;
        set df_full;
        if Test = 'Likelihood Ratio';
        call symput('df_full', DF);
    run;

PROC LOGISTIC: INTERACTION EFFECT

    /* reduced model (no interaction): save the same two tables */
    proc logistic data=breathTestAge;
        class neversmk (ref="never") over40 (ref="no") / param=ref;
        weight count;
        model test = neversmk over40;
        ods output FitStatistics=log2Ratio_reduce GlobalTests=df_reduce;
    run;

    /* store -2 Log L of the reduced model in a macro variable */
    data _null_;
        set log2Ratio_reduce;
        if Criterion = '-2 Log L';
        call symput('neg2L_reduce', InterceptAndCovariates);
    run;

    /* store the degrees of freedom of the reduced model */
    data _null_;
        set df_reduce;
        if Test = 'Likelihood Ratio';
        call symput('df_reduce', DF);
    run;

PROC LOGISTIC: INTERACTION EFFECT

    data result;
        LR = &neg2L_reduce - &neg2L_full;
        df = &df_full - &df_reduce;
        p = 1 - probchi(LR, df);
        label LR = 'Likelihood Ratio';
    run;

    proc print data=result label noobs;
        title "Likelihood ratio test";
    run;

Likelihood ratio test

Likelihood
Ratio         df    p
20.6558        1    .000005497

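The same likelihood ratio test can be checked by hand from the two printed "-2 Log L" values (1068.123 for the reduced model, 1047.467 for the full model). The sketch below is a Python cross-check, not SAS. It handles only the 1-degree-of-freedom case, using P(chi-square with 1 df > x) = erfc(sqrt(x / 2)), and because it starts from the rounded printed values the statistic differs from the SAS result in the fourth decimal.

```python
import math

# Values read from the Model Fit Statistics and Global Null Hypothesis output
neg2l_reduced = 1068.123   # model test = neversmk over40
neg2l_full = 1047.467      # model test = neversmk over40 neversmk*over40
df_reduced, df_full = 2, 3

lr = neg2l_reduced - neg2l_full   # likelihood ratio statistic
df = df_full - df_reduced         # number of parameters being tested

if df != 1:
    raise ValueError("this shortcut for the p-value is valid for 1 df only")
p_value = math.erfc(math.sqrt(lr / 2))

print(f"LR = {lr:.4f}, df = {df}, p = {p_value:.3e}")
```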
PROC LOGISTIC: INTERACTION EFFECT

    proc logistic data=breathTestAge;
        class neversmk (ref="never") over40 (ref="no") / param=ref;
        weight count;
        model test = neversmk over40 neversmk*over40;
        oddsratio neversmk / at (over40='no');
        oddsratio neversmk / at (over40='yes');
    run;

Wald Confidence Interval for Odds Ratios

Label                                         Estimate    95% Confidence Limits
neversmk current vs never at over40=no         1.418      0.914     2.200
neversmk current vs never at over40=yes       12.383      4.441    34.525

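Under reference cell coding, these age-specific odds ratios follow directly from the coefficients of the interaction model: at over40=no the odds ratio is exp(β for neversmk), and at over40=yes it is exp(β for neversmk + β for the interaction). The sketch below is a Python cross-check using the rounded estimates printed in the Analysis of Maximum Likelihood Estimates table, so the last digit may differ slightly from the SAS output. The variable names are illustrative.

```python
import math

# Rounded estimates from the interaction model
beta_smoking = 0.3495       # neversmk: current vs never
beta_interaction = 2.1668   # neversmk*over40: current, yes

# Odds ratio for current vs never smokers within each age group
or_under_40 = math.exp(beta_smoking)
or_over_40 = math.exp(beta_smoking + beta_interaction)

print(f"OR at over40=no : {or_under_40:.3f}")
print(f"OR at over40=yes: {or_over_40:.3f}")
```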
NURSE HEALTH STUDY

    data nurse_study;
        input bc age oc count;
        datalines;
    1 0 1 71
    0 0 1 28418
    1 0 0 35
    0 0 0 12267
    1 1 1 143
    0 1 1 20661
    1 1 0 321
    0 1 0 44424
    ;

                         BREAST CANCER
              AGE 30-39 (0)                AGE 40-55 (1)
OC USE        CASE (1)    CONTROL (0)      CASE (1)    CONTROL (0)
YES (1)             71          28418           143          20661
NO (0)              35          12267           321          44424

NURSE HEALTH STUDY

    proc freq data=nurse_study order=data;
        weight count;
        tables age*oc*bc / chisq relrisk cmh;
    run;

Breslow-Day Test for Homogeneity of the Odds Ratios
------------------------------
Chi-Square         0.1521
DF                      1
Pr > ChiSq         0.6966

There is no interaction.
Check for confounding.

NURSE HEALTH STUDY

Summary Statistics for oc by bc
Controlling for age

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

Statistic    Alternative Hypothesis     DF    Value     Prob
-------------------------------------------------------------
1            Nonzero Correlation         1    0.4361    0.5090
2            Row Mean Scores Differ      1    0.4361    0.5090
3            General Association         1    0.4361    0.5090

Estimates of the Common Relative Risk (Row1/Row2)

Type of Study     Method             Value     95% Confidence Limits
--------------------------------------------------------------------
Case-Control      Mantel-Haenszel    0.9419    0.7882    1.1256
(Odds Ratio)      Logit              0.9415    0.7882    1.1246
Cohort            Mantel-Haenszel    0.9422    0.7897    1.1243
(Col1 Risk)       Logit              0.9419    0.7894    1.1238
Cohort            Mantel-Haenszel    1.0003    0.9994    1.0013
(Col2 Risk)       Logit              1.0003    0.9995    1.0012

NURSE HEALTH STUDY

    proc freq data=nurse_study order=data;
        weight count;
        tables oc*bc / chisq relrisk;
    run;

Statistics for Table of oc by bc

Statistic                      DF    Value      Prob
-----------------------------------------------------
Chi-Square                      1    17.8881    <.0001
Likelihood Ratio Chi-Square     1    18.1401    <.0001
Continuity Adj. Chi-Square      1    17.5337    <.0001
Mantel-Haenszel Chi-Square      1    17.8879    <.0001
Phi Coefficient                      -0.0130
Contingency Coefficient               0.0130
Cramer's V                           -0.0130

Estimates of the Relative Risk (Row1/Row2)

Type of Study                 Value     95% Confidence Limits
-------------------------------------------------------------
Case-Control (Odds Ratio)     0.6944    0.5858    0.8230
Cohort (Col1 Risk)            0.6957    0.5874    0.8239
Cohort (Col2 Risk)            1.0019    1.0010    1.0028

NURSE HEALTH STUDY

    proc logistic data=nurse_study descending;
        weight count;
        model bc = oc age;
    run;

Analysis of Maximum Likelihood Estimates

                              Standard    Wald
Parameter    DF    Estimate   Error       Chi-Square    Pr > ChiSq
Intercept     1    -5.9083    0.1156      2612.5788     <.0001
oc            1    -0.0602    0.0911         0.4360     0.5090
age           1     0.9835    0.1133        75.3707     <.0001

Odds Ratio Estimates

          Point       95% Wald
Effect    Estimate    Confidence Limits
oc        0.942       0.788    1.126
age       2.674       2.141    3.338

NURSE HEALTH STUDY

    proc logistic data=nurse_study descending;
        weight count;
        model bc = oc;
    run;

Analysis of Maximum Likelihood Estimates

                              Standard    Wald
Parameter    DF    Estimate   Error       Chi-Square    Pr > ChiSq
Intercept     1    -5.0704    0.0532      9095.8096     <.0001
oc            1    -0.3646    0.0867        17.6834     <.0001

Odds Ratio Estimates

          Point       95% Wald
Effect    Estimate    Confidence Limits
oc        0.694       0.586    0.823

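The gap between the crude and the age-adjusted odds ratio for OC use can be reproduced from the eight cell counts in the nurse_study data. The sketch below is a Python cross-check, not SAS. It computes the crude odds ratio from the table collapsed over age and the Mantel-Haenszel odds ratio from the two age strata; the function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio AD / BC for one 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# (OC users: cases, controls; non-users: cases, controls) per age group
age_30_39 = (71, 28418, 35, 12267)
age_40_55 = (143, 20661, 321, 44424)

# Crude table: add the two age groups cell by cell
crude_table = tuple(x + y for x, y in zip(age_30_39, age_40_55))

crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or([age_30_39, age_40_55])

print(f"OR, age 30-39     = {odds_ratio(*age_30_39):.4f}")
print(f"OR, age 40-55     = {odds_ratio(*age_40_55):.4f}")
print(f"Crude OR          = {crude_or:.4f}")
print(f"Age-adjusted (MH) = {adjusted_or:.4f}")
```

Both age-specific odds ratios lie much closer to 1 than the crude value, which is the pattern expected when age confounds the association.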
CONCLUSION

Javier Garcia - Verdugo Sanchez - Six Sigma Training - W2 Correlation and Reg...Javier Garcia - Verdugo Sanchez - Six Sigma Training - W2 Correlation and Reg...
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W2 Correlation and Reg...J. García - Verdugo
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionKhalid Aziz
 
Statistics
StatisticsStatistics
Statisticsmegamsma
 
New lengkap karakteristik mancova
New lengkap karakteristik mancovaNew lengkap karakteristik mancova
New lengkap karakteristik mancovaAdiSusilo27
 
Chap 9 A Process Capability & Spc Hk
Chap 9 A Process Capability & Spc HkChap 9 A Process Capability & Spc Hk
Chap 9 A Process Capability & Spc Hkajithsrc
 
RESPONSE SURFACE METHODOLOGY
RESPONSE SURFACE METHODOLOGYRESPONSE SURFACE METHODOLOGY
RESPONSE SURFACE METHODOLOGYhakitasina
 
Statistic process control
Statistic process controlStatistic process control
Statistic process controlarif silaban
 
TMPA-2017: The Quest for Average Response Time
TMPA-2017: The Quest for Average Response TimeTMPA-2017: The Quest for Average Response Time
TMPA-2017: The Quest for Average Response TimeIosif Itkin
 
Power Market and Models Convergence ?
Power Market and Models Convergence ?Power Market and Models Convergence ?
Power Market and Models Convergence ?NicolasRR
 
Statistical process control (spc)
Statistical process control (spc)Statistical process control (spc)
Statistical process control (spc)Dinah Faye Indino
 
BU_FCAI_SCC430_Modeling&Simulation_Ch05-P2.pptx
BU_FCAI_SCC430_Modeling&Simulation_Ch05-P2.pptxBU_FCAI_SCC430_Modeling&Simulation_Ch05-P2.pptx
BU_FCAI_SCC430_Modeling&Simulation_Ch05-P2.pptxMaiGaafar
 

Similar a Analysis Of A Binary Outcome Variable (20)

Logistic Regression in Case-Control Study
Logistic Regression in Case-Control StudyLogistic Regression in Case-Control Study
Logistic Regression in Case-Control Study
 
Statistical Process Control
Statistical Process ControlStatistical Process Control
Statistical Process Control
 
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
Ch 03 MATLAB Applications in Chemical Engineering_陳奇中教授教學投影片
 
Cairo 02 Stat Inference
Cairo 02 Stat InferenceCairo 02 Stat Inference
Cairo 02 Stat Inference
 
Tryptone task
Tryptone taskTryptone task
Tryptone task
 
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W2 Correlation and Reg...
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W2 Correlation and Reg...Javier Garcia - Verdugo Sanchez - Six Sigma Training - W2 Correlation and Reg...
Javier Garcia - Verdugo Sanchez - Six Sigma Training - W2 Correlation and Reg...
 
06 control.systems
06 control.systems06 control.systems
06 control.systems
 
Facility Location
Facility Location Facility Location
Facility Location
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Statistics
StatisticsStatistics
Statistics
 
New lengkap karakteristik mancova
New lengkap karakteristik mancovaNew lengkap karakteristik mancova
New lengkap karakteristik mancova
 
How to use statistica for rsm study
How to use statistica for rsm studyHow to use statistica for rsm study
How to use statistica for rsm study
 
Chap 9 A Process Capability & Spc Hk
Chap 9 A Process Capability & Spc HkChap 9 A Process Capability & Spc Hk
Chap 9 A Process Capability & Spc Hk
 
RESPONSE SURFACE METHODOLOGY
RESPONSE SURFACE METHODOLOGYRESPONSE SURFACE METHODOLOGY
RESPONSE SURFACE METHODOLOGY
 
Statistic process control
Statistic process controlStatistic process control
Statistic process control
 
Ch04
Ch04Ch04
Ch04
 
TMPA-2017: The Quest for Average Response Time
TMPA-2017: The Quest for Average Response TimeTMPA-2017: The Quest for Average Response Time
TMPA-2017: The Quest for Average Response Time
 
Power Market and Models Convergence ?
Power Market and Models Convergence ?Power Market and Models Convergence ?
Power Market and Models Convergence ?
 
Statistical process control (spc)
Statistical process control (spc)Statistical process control (spc)
Statistical process control (spc)
 
BU_FCAI_SCC430_Modeling&Simulation_Ch05-P2.pptx
BU_FCAI_SCC430_Modeling&Simulation_Ch05-P2.pptxBU_FCAI_SCC430_Modeling&Simulation_Ch05-P2.pptx
BU_FCAI_SCC430_Modeling&Simulation_Ch05-P2.pptx
 

Analysis Of A Binary Outcome Variable

  • 1. Analysis of a Binary Outcome Variable Using the FREQ and the LOGISTIC Procedures Arthur Li
  • 5. ODDS RATIO
       2x2 table (rows: Exposure X, columns: Outcome Y)

                        Outcome (Y)
       Exposure (X)      1       0
            1            A       B
            0            C       D

       $\text{Odds}_1 = A / B$
       $\text{Odds}_0 = C / D$
       $\text{Odds Ratio} = \text{Odds}_1 / \text{Odds}_0 = AD / BC$
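As a quick check of the formula above, here is a minimal Python sketch (not part of the original SAS deck) that applies OR = AD / BC to the breathing-test counts used later in the talk (A = 131, B = 927, C = 38, D = 741). The function name `odds_ratio` is made up for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table laid out as
    exposed:   a (outcome = 1), b (outcome = 0)
    unexposed: c (outcome = 1), d (outcome = 0)
    """
    odds_exposed = a / b      # Odds_1 = A / B
    odds_unexposed = c / d    # Odds_0 = C / D
    return odds_exposed / odds_unexposed  # equals A*D / (B*C)


# Breathing-test counts: current smokers (131 abnormal, 927 normal),
# never smokers (38 abnormal, 741 normal).
or_smoking = odds_ratio(131, 927, 38, 741)
print(round(or_smoking, 4))  # 2.7557
```

The result agrees with the "Case-Control (Odds Ratio)" value that PROC FREQ reports for this table.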
  • 8. PROC FREQ
       data breathTest;
         input test $ 1-8 neversmk $ 10-16 count;
         datalines;
       abnormal current 131
       normal   current 927
       abnormal never    38
       normal   never   741
       ;

                              BREATHING TEST (Y)
       SMOKING STATUS (X)     ABNORMAL (1)   NORMAL (0)
       CURRENT (1)              131 (A)        927 (B)
       NEVER (0)                 38 (C)        741 (D)
  • 9. PROC FREQ
       proc freq data=breathTest;
         weight count;
         tables neversmk*test;
       run;

       The WEIGHT statement is needed because the data are entered directly as the cell counts of the table.

       The FREQ Procedure
       Table of neversmk by test

       neversmk     test
       Frequency |
       Percent   |
       Row Pct   |
       Col Pct   | abnormal | normal   |  Total
       ----------+----------+----------+
       current   |    131   |    927   |   1058
                 |   7.13   |  50.46   |  57.59
                 |  12.38   |  87.62   |
                 |  77.51   |  55.58   |
       ----------+----------+----------+
       never     |     38   |    741   |    779
                 |   2.07   |  40.34   |  42.41
                 |   4.88   |  95.12   |
                 |  22.49   |  44.42   |
       ----------+----------+----------+
       Total          169       1668       1837
                     9.20      90.80     100.00
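The row percentages in this table are the risks P1 and P0 from the study-design slide. The following Python sketch (an illustration added here, not the author's code) recomputes them from the cell counts and forms their ratio, the relative risk; all variable names are hypothetical.

```python
# Cell counts from the breathTest table.
a, b = 131, 927   # current smokers: abnormal, normal
c, d = 38, 741    # never smokers:   abnormal, normal

risk_current = a / (a + b)   # P1 = A / (A + B)
risk_never = c / (c + d)     # P0 = C / (C + D)
relative_risk = risk_current / risk_never

print(round(100 * risk_current, 2))  # row percent for current/abnormal
print(round(100 * risk_never, 2))    # row percent for never/abnormal
print(round(relative_risk, 4))
```

The two percentages correspond to the Row Pct entries 12.38 and 4.88, and the ratio corresponds to the "Cohort (Col1 Risk)" estimate that the RELRISK option reports.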
  • 12. PROC FREQ - CHISQ
        proc freq data=breathTest;
          weight count;
          tables neversmk*test / relrisk chisq;
        run;

        Statistics for Table of neversmk by test

        Statistic                      DF      Value     Prob
        -----------------------------------------------------
        Chi-Square                      1    30.2421   <.0001
        Likelihood Ratio Chi-Square     1    32.3820   <.0001
        Continuity Adj. Chi-Square      1    29.3505   <.0001
        Mantel-Haenszel Chi-Square      1    30.2257   <.0001
        Phi Coefficient                       0.1283
        Contingency Coefficient               0.1273
        Cramer's V                            0.1283
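The first row of this output can be reproduced by hand. The sketch below, in Python rather than SAS, uses the standard shortcut formula for the Pearson chi-square of a 2x2 table, n(AD - BC)^2 / (row and column totals multiplied together), and the identity that a chi-square upper-tail probability with 1 degree of freedom equals erfc(sqrt(x / 2)). It is a check under those assumptions, not a description of how PROC FREQ computes the statistic internally.

```python
import math

# Cell counts from the breathTest table.
a, b = 131, 927   # current smokers: abnormal, normal
c, d = 38, 741    # never smokers:   abnormal, normal
n = a + b + c + d

# Pearson chi-square for a 2x2 table (shortcut form).
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Upper-tail probability for 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 4))
print(p_value < 0.0001)
```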
  • 15. LOGISTIC REGRESSION MODEL
        Reference cell coding
        β: the increment in log odds for current smokers compared to those who never smoked

                            BREATHING TEST
        SMOKING STATUS      ABNORMAL   NORMAL
        CURRENT                131       927
        NEVER                   38       741
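The model equation on this slide did not survive extraction. Under reference cell coding with "never" as the reference level, a model consistent with the description of β would presumably read as follows; the intercept symbol α and the indicator X are notation assumed here, not taken from the slide.

```latex
\log\frac{P(Y = 1 \mid X)}{1 - P(Y = 1 \mid X)} = \alpha + \beta X,
\qquad
X = \begin{cases} 1 & \text{current smoker} \\ 0 & \text{never smoked} \end{cases}

% Log odds by group, and their difference:
\text{log odds (never)} = \alpha, \qquad
\text{log odds (current)} = \alpha + \beta, \qquad
\text{OR} = e^{\beta}
```

With Y = 1 denoting an abnormal test, β is the difference in log odds between current and never smokers, so exponentiating it gives the odds ratio AD / BC.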
  • 16. LOGISTIC REGRESSION MODEL
        proc logistic data=breathTest;
          class neversmk / param=ref;
          weight count;
          model test = neversmk;
        run;

        The LOGISTIC Procedure

        Model Information
        Data Set                        WORK.BREATHTEST
        Response Variable               test
        Number of Response Levels       2
        Weight Variable                 count
        Model                           binary logit
        Optimization Technique          Fisher's scoring

        Number of Observations Read        4
        Number of Observations Used        4
        Sum of Weights Read             1837
        Sum of Weights Used             1837
  • 26. LOGISTIC REGRESSION MODEL
        proc logistic data=breathTest;
          class neversmk (ref="never") / param=ref;
          weight count;
          model test = neversmk;
        run;

        Type 3 Analysis of Effects
                              Wald
        Effect       DF    Chi-Square    Pr > ChiSq
        neversmk      1       28.2434        <.0001

        Analysis of Maximum Likelihood Estimates
                                          Standard          Wald
        Parameter           DF  Estimate     Error    Chi-Square    Pr > ChiSq
        Intercept            1   -2.9704    0.1663      318.9365        <.0001
        neversmk  current    1    1.0136    0.1907       28.2434        <.0001

        Current smokers have a 1.01 increase in the log odds of an abnormal test compared with people who never smoked.
        OR = exp(1.0136) = 2.756
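Because this model has a single binary predictor, its two parameters can be read straight off the 2x2 table: the intercept is the log odds among never smokers and the slope is the log odds ratio. The Python sketch below (an added illustration with hypothetical variable names) checks that against the estimates above; agreement is to about three decimals, since the printed SAS values are rounded.

```python
import math

# Cell counts from the breathTest table.
a, b = 131, 927   # current smokers: abnormal, normal
c, d = 38, 741    # never smokers:   abnormal, normal

intercept = math.log(c / d)            # log odds of abnormal, never smokers
slope = math.log((a / b) / (c / d))    # log odds ratio, current vs never

print(round(intercept, 3), round(slope, 3))
print(round(math.exp(slope), 3))       # back-transformed odds ratio
```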
  • 27. LOGISTIC REGRESSION MODEL
        proc logistic data=breathTest;
          class neversmk (ref="never") / param=ref;
          weight count;
          model test = neversmk;
        run;

        Odds Ratio Estimates
                                       Point          95% Wald
        Effect                      Estimate      Confidence Limits
        neversmk current vs never      2.756       1.896       4.004

        Result from PROC FREQ:

        Estimates of the Relative Risk (Row1/Row2)
        Type of Study                   Value    95% Confidence Limits
        --------------------------------------------------------------
        Case-Control (Odds Ratio)      2.7557      1.8962      4.0047
        Cohort (Col1 Risk)             2.5383      1.7904      3.5987
        Cohort (Col2 Risk)             0.9211      0.8960      0.9470

        Sample Size = 1837
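The Wald limits shown by both procedures can be recovered from the cell counts. The sketch below is a Python illustration, assuming the usual large-sample standard error of a log odds ratio, sqrt(1/A + 1/B + 1/C + 1/D), and a normal 97.5% quantile; it is meant as a hand check of the printed numbers.

```python
import math
from statistics import NormalDist

# Cell counts from the breathTest table.
a, b = 131, 927   # current smokers: abnormal, normal
c, d = 38, 741    # never smokers:   abnormal, normal

log_or = math.log((a * d) / (b * c))
se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of the log odds ratio
z = NormalDist().inv_cdf(0.975)                 # about 1.96

lower = math.exp(log_or - z * se)
upper = math.exp(log_or + z * se)

print(round(se, 4))
print(round(lower, 3), round(upper, 3))
```

The standard error matches the 0.1907 reported for the `neversmk` parameter, and the limits match the 95% Wald confidence limits to the printed precision.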
  • 28. LOGISTIC REGRESSION MODEL
        proc logistic data=breathTest;
          class neversmk (ref="never") / param=ref;
          weight count;
          model test = neversmk;
          oddsratio 'smoking' neversmk;
        run;

        Syntax (new in SAS 9.2):
        ODDSRATIO <'label'> variable </ options>;

        Wald Confidence Interval for Odds Ratios
        Label                                  Estimate    95% Confidence Limits
        smoking neversmk current vs never         2.756       1.896       4.004
  • 30. CONFOUNDING
        [Diagram: Smoking, Test, Age]
        Leaving Age out of the model can cause the relationship between Smoking and Test to be over- or underestimated.
  • 31. CONFOUNDING
        [Diagram: Smoking, Test, Age]
        [Figure: log(odds) against Age for smokers and non-smokers, in the age groups < 40 and ≥ 40]
        Adjusting for age, you compare smokers and non-smokers at common values of age.
  • 37. THE PURPOSES AND STRATEGIES FOR MODEL BUILDING
        Is the association between "Smoking" and "Test" different in the 2 age groups?
          Y: There is an interaction. Report age-specific ORs.
          N: No interaction. Is "Age" a confounder?
               Y: Report the age-adjusted OR.
               N: Report the crude OR.
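The flowchart amounts to two nested yes/no questions. A minimal Python sketch of that logic follows; the function name and the boolean inputs are hypothetical, and how each question gets answered (for example, by a Breslow-Day test, or by comparing crude and adjusted ORs) is left to the analyst.

```python
def what_to_report(interaction: bool, confounder: bool) -> str:
    """Follow the model-building flowchart for one adjusting variable."""
    if interaction:
        # The association differs across age groups.
        return "age-specific ORs"
    if confounder:
        # Homogeneous association, but age distorts the crude estimate.
        return "age-adjusted OR"
    return "crude OR"


print(what_to_report(interaction=True, confounder=False))
print(what_to_report(interaction=False, confounder=True))
print(what_to_report(interaction=False, confounder=False))
```

Note that the confounding question is only reached when there is no interaction, which mirrors the order of the checks in the flowchart.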
  • 39. PROC FREQ: INTERACTION EFFECT
        data breathTestAge;
          input test $ 1-8 neversmk $ 10-16 over40 $ 18-20 count;
          datalines;
        normal   never   no  577
        abnormal never   no   34
        normal   current no  682
        abnormal current no   57
        normal   never   yes 164
        abnormal never   yes   4
        normal   current yes 245
        abnormal current yes  74
        ;
  • 41. PROC FREQ: INTERACTION EFFECT
        proc freq data=breathTestAge;
          weight count;
          tables over40*neversmk*test / chisq relrisk cmh;
        run;

        Breslow-Day Test for Homogeneity of the Odds Ratios
        ------------------------------
        Chi-Square        18.0829
        DF                      1
        Pr > ChiSq         <.0001

        Total Sample Size = 1837

        The association between smoking status and the breathing test is not the same across the age groups.
  • 42. PROC FREQ: INTERACTION EFFECT
        proc freq data=breathTestAge;
          weight count;
          tables over40*neversmk*test / chisq relrisk cmh;
        run;

        Statistics for Table 1 of neversmk by test
        Controlling for over40=no

        Statistic                      DF      Value     Prob
        -----------------------------------------------------
        Chi-Square                      1     2.4559   0.1171
        Likelihood Ratio Chi-Square     1     2.4893   0.1146
        Continuity Adj. Chi-Square      1     2.1260   0.1448
        Mantel-Haenszel Chi-Square      1     2.4541   0.1172
        Phi Coefficient                       0.0427
        Contingency Coefficient               0.0426
        Cramer's V                            0.0427

        Statistics for Table 1 of neversmk by test
        Controlling for over40=no

        Estimates of the Relative Risk (Row1/Row2)
        Type of Study                   Value    95% Confidence Limits
        --------------------------------------------------------------
        Case-Control (Odds Ratio)      1.4184      0.9144      2.2000
        Cohort (Col1 Risk)             1.3861      0.9190      2.0906
        Cohort (Col2 Risk)             0.9772      0.9499      1.0054

        Sample Size = 1350
  • 43. PROC FREQ: INTERACTION EFFECT
        proc freq data=breathTestAge;
          weight count;
          tables over40*neversmk*test / chisq relrisk cmh;
        run;

        Statistics for Table 2 of neversmk by test
        Controlling for over40=yes

        Statistic                      DF      Value     Prob
        -----------------------------------------------------
        Chi-Square                      1    35.4510   <.0001
        Likelihood Ratio Chi-Square     1    45.1246   <.0001
        Continuity Adj. Chi-Square      1    33.9203   <.0001
        Mantel-Haenszel Chi-Square      1    35.3782   <.0001
        Phi Coefficient                       0.2698
        Contingency Coefficient               0.2605
        Cramer's V                            0.2698

        Estimates of the Relative Risk (Row1/Row2)
        Type of Study                   Value    95% Confidence Limits
        --------------------------------------------------------------
        Case-Control (Odds Ratio)     12.3837      4.4416     34.5272
        Cohort (Col1 Risk)             9.7429      3.6253     26.1844
        Cohort (Col2 Risk)             0.7868      0.7374      0.8394
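The two age-specific odds ratios can be checked directly from the breathTestAge counts with OR = AD / BC applied within each stratum. This Python sketch is an added illustration; the dictionary layout and names are assumptions.

```python
# Counts per age stratum, ordered as
# (current abnormal, current normal, never abnormal, never normal).
strata = {
    "no":  (57, 682, 34, 577),   # over40 = no
    "yes": (74, 245, 4, 164),    # over40 = yes
}

stratum_or = {}
for over40, (a, b, c, d) in strata.items():
    stratum_or[over40] = (a * d) / (b * c)

for over40, value in stratum_or.items():
    print(over40, round(value, 4))
```

The large gap between the two stratum-specific values is what the Breslow-Day test flags as heterogeneity.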
  • 44. PROC FREQ: INTERACTION EFFECT
        proc freq data=breathTestAge;
          weight count;
          tables over40*neversmk*test / chisq relrisk cmh;
        run;

        Summary Statistics for neversmk by test
        Controlling for over40

        Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
        Statistic    Alternative Hypothesis     DF      Value     Prob
        --------------------------------------------------------------
            1        Nonzero Correlation         1    25.2444   <.0001
            2        Row Mean Scores Differ      1    25.2444   <.0001
            3        General Association         1    25.2444   <.0001

        Estimates of the Common Relative Risk (Row1/Row2)
        Type of Study     Method                Value    95% Confidence Limits
        ----------------------------------------------------------------------
        Case-Control      Mantel-Haenszel      2.5683      1.7618      3.7441
          (Odds Ratio)    Logit                1.9840      1.3252      2.9702
        Cohort            Mantel-Haenszel      2.4174      1.6754      3.4879
          (Col1 Risk)     Logit                1.8475      1.2641      2.7001
        Cohort            Mantel-Haenszel      0.9289      0.9046      0.9538
          (Col2 Risk)     Logit                0.9437      0.9195      0.9686

        These statistics and the adjusted OR are only useful if the OR is homogeneous across the categories of the adjusting variable.
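For reference, the Mantel-Haenszel common odds ratio is a weighted combination of the stratum tables: the sum of AD / n over strata divided by the sum of BC / n. The Python sketch below (added here for illustration, with hypothetical names) recomputes it from the breathTestAge counts.

```python
# Counts per age stratum, ordered as
# (current abnormal, current normal, never abnormal, never normal).
strata = [
    (57, 682, 34, 577),   # over40 = no
    (74, 245, 4, 164),    # over40 = yes
]

numerator = 0.0
denominator = 0.0
for a, b, c, d in strata:
    n = a + b + c + d
    numerator += a * d / n
    denominator += b * c / n

mh_odds_ratio = numerator / denominator
print(round(mh_odds_ratio, 4))
```

The pooled value sits between the stratum-specific ORs of 1.4184 and 12.3837, which is why it is a poor summary when the stratum ORs differ this much.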
  • 45. PROC LOGISTIC: INTERACTION EFFECT
        proc logistic data=breathTestAge;
          class neversmk (ref="never") over40 (ref="no") / param=ref;
          weight count;
          model test = neversmk over40 neversmk*over40;
        run;
  • 46. PROC LOGISTIC: INTERACTION EFFECT
        proc logistic data=breathTestAge;
          class neversmk (ref="never") over40 (ref="no") / param=ref;
          weight count;
          model test = neversmk over40 neversmk*over40;
        run;

        Wald Test:

        Analysis of Maximum Likelihood Estimates
                                                    Standard          Wald
        Parameter                     DF  Estimate     Error    Chi-Square    Pr > ChiSq
        Intercept                      1   -2.8315    0.1765      257.4193        <.0001
        neversmk         current       1    0.3495    0.2240        2.4355        0.1186
        over40           yes           1   -0.8820    0.5359        2.7086        0.0998
        neversmk*over40  current yes   1    2.1668    0.5691       14.4985        0.0001
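With two binary predictors and their interaction, the model has one parameter per cell of the 2x2 layout of groups, so the estimates can be reconstructed from the four group-specific log odds. The Python sketch below is an added hand check under that assumption (names are hypothetical); it agrees with the printed estimates to about three decimals.

```python
import math


def log_odds(abnormal, normal):
    """Log odds of an abnormal test within one group."""
    return math.log(abnormal / normal)


# Group-specific log odds from the breathTestAge counts.
never_young = log_odds(34, 577)     # never smoked, over40 = no (reference)
current_young = log_odds(57, 682)   # current smoker, over40 = no
never_old = log_odds(4, 164)        # never smoked, over40 = yes
current_old = log_odds(74, 245)     # current smoker, over40 = yes

intercept = never_young
b_smoke = current_young - never_young
b_age = never_old - never_young
b_interaction = (current_old - never_old) - (current_young - never_young)

for name, value in [("Intercept", intercept), ("neversmk", b_smoke),
                    ("over40", b_age), ("neversmk*over40", b_interaction)]:
    print(name, round(value, 3))
```

The interaction term is the difference between the two age-specific log odds ratios, which is why a large value signals that the smoking effect changes with age.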
  • 47. PROC LOGISTIC: INTERACTION EFFECT
        proc logistic data=breathTestAge;
          class neversmk (ref="never") over40 (ref="no") / param=ref;
          weight count;
          model test = neversmk over40 neversmk*over40;
        run;

        Likelihood Ratio Test:
  • 48. PROC LOGISTIC: INTERACTION EFFECT
        proc logistic data=breathTestAge;
          class neversmk (ref="never") over40 (ref="no") / param=ref;
          weight count;
          model test = neversmk over40 neversmk*over40;
        run;

        Model Fit Statistics
                                          Intercept
                         Intercept           and
        Criterion          Only          Covariates
        AIC              1130.417          1055.467
        SC               1130.497          1055.785
        -2 Log L         1128.417          1047.467

        Testing Global Null Hypothesis: BETA=0
        Test                 Chi-Square    DF    Pr > ChiSq
        Likelihood Ratio        80.9500     3        <.0001
        Score                   95.7956     3        <.0001
        Wald                    81.3305     3        <.0001
  • 49. PROC LOGISTIC: INTERACTION EFFECT
        proc logistic data=breathTestAge;
          class neversmk (ref="never") over40 (ref="no") / param=ref;
          weight count;
          model test = neversmk over40;
        run;

        Model Fit Statistics
                                          Intercept
                         Intercept           and
        Criterion          Only          Covariates
        AIC              1130.417          1074.123
        SC               1130.497          1074.361
        -2 Log L         1128.417          1068.123

        Testing Global Null Hypothesis: BETA=0
        Test                 Chi-Square    DF    Pr > ChiSq
        Likelihood Ratio        60.2942     2        <.0001
        Score                   61.2515     2        <.0001
        Wald                    56.4737     2        <.0001
  • 50. PROC LOGISTIC: INTERACTION EFFECT
        proc logistic data=breathTestAge;
          class neversmk (ref="never") over40 (ref="no") / param=ref;
          weight count;
          model test = neversmk over40 neversmk*over40;
          ods output FitStatistics = log2Ratio_full
                     GlobalTests   = df_full;

        data _null_;
          set log2Ratio_full;
          if Criterion = '-2 Log L';
          call symput('neg2L_full', InterceptAndCovariates);

        data _null_;
          set df_full;
          if Test = 'Likelihood Ratio';
          call symput('df_full', DF);
  • 51. PROC LOGISTIC: INTERACTION EFFECT
        proc logistic data=breathTestAge;
          class neversmk (ref="never") over40 (ref="no") / param=ref;
          weight count;
          model test = neversmk over40;
          ods output FitStatistics = log2Ratio_reduce
                     GlobalTests   = df_reduce;

        data _null_;
          set log2Ratio_reduce;
          if Criterion = '-2 Log L';
          call symput('neg2L_reduce', InterceptAndCovariates);

        data _null_;
          set df_reduce;
          if Test = 'Likelihood Ratio';
          call symput('df_reduce', DF);
        run;
  • 52. PROC LOGISTIC: INTERACTION EFFECT
        data result;
          LR = &neg2L_reduce - &neg2L_full;
          df = &df_full - &df_reduce;
          p = 1 - probchi(LR, df);
          label LR = 'Likelihood Ratio';

        proc print data=result label noobs;
          title "Likelihood ratio test";
        run;

        Likelihood ratio test

        Likelihood Ratio    df              p
                 20.6558     1    .000005497
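The same likelihood ratio test can be reproduced outside SAS from the two printed "-2 Log L" values. The Python sketch below is an added check, not the author's code; it uses the identity that the chi-square upper-tail probability with 1 degree of freedom equals erfc(sqrt(x / 2)), so it applies only to this 1-df comparison.

```python
import math

# "-2 Log L" (Intercept and Covariates) from the two fitted models.
neg2l_reduced = 1068.123   # model without the interaction term
neg2l_full = 1047.467      # model with neversmk*over40

# Degrees of freedom of the global likelihood ratio tests.
df_full, df_reduced = 3, 2

lr = neg2l_reduced - neg2l_full
df = df_full - df_reduced

# Upper-tail chi-square probability; this shortcut is valid for df == 1 only.
p_value = math.erfc(math.sqrt(lr / 2))

print(round(lr, 3), df)
print(p_value)
```

Because the inputs are the rounded values printed by SAS, the statistic and p-value agree with the PROC PRINT output only to roughly three significant digits.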
  • 53. PROC LOGISTIC: INTERACTION EFFECT
        proc logistic data=breathTestAge;
          class neversmk (ref="never") over40 (ref="no") / param=ref;
          weight count;
          model test = neversmk over40 neversmk*over40;
          oddsratio neversmk / at (over40 = 'no');
          oddsratio neversmk / at (over40 = 'yes');
        run;

        Wald Confidence Interval for Odds Ratios
        Label                                         Estimate    95% Confidence Limits
        neversmk current vs never at over40=no           1.418       0.914       2.200
        neversmk current vs never at over40=yes         12.383       4.441      34.525
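Under reference cell coding, the age-specific odds ratios follow from the interaction-model coefficients: the smoking coefficient alone gives the OR when over40 = no, and the smoking coefficient plus the interaction coefficient gives the OR when over40 = yes. The Python sketch below (an added illustration) applies this to the rounded estimates printed earlier, so the results match the ODDSRATIO output only approximately.

```python
import math

# Rounded estimates from the interaction model.
b_smoke = 0.3495         # neversmk (current vs never)
b_interaction = 2.1668   # neversmk*over40 (current, yes)

or_under_40 = math.exp(b_smoke)                  # at over40 = no
or_over_40 = math.exp(b_smoke + b_interaction)   # at over40 = yes

print(round(or_under_40, 3))
print(round(or_over_40, 2))
```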
  • 55. NURSE HEALTH STUDY
        data nurse_study;
          input bc age oc count;
          datalines;
        1 0 1 71
        0 0 1 28418
        1 0 0 35
        0 0 0 12267
        1 1 1 143
        0 1 1 20661
        1 1 0 321
        0 1 0 44424
        ;

                                       BREAST CANCER
                       AGE 30-39 (0)                 AGE 40-55 (1)
        OC USE      CASE (1)   CONTROL (0)        CASE (1)   CONTROL (0)
        YES (1)         71        28418              143        20661
        NO (0)          35        12267              321        44424
  • 56. NURSE HEALTH STUDY
        proc freq data=nurse_study order=data;
          weight count;
          tables age*oc*bc / chisq relrisk cmh;
        run;

        Breslow-Day Test for Homogeneity of the Odds Ratios
        ------------------------------
        Chi-Square         0.1521
        DF                      1
        Pr > ChiSq         0.6966

        There is no interaction.
        Next step: check for confounding.
  • 57. NURSE HEALTH STUDY
        Summary Statistics for oc by bc
        Controlling for age

        Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
        Statistic    Alternative Hypothesis     DF      Value     Prob
        --------------------------------------------------------------
            1        Nonzero Correlation         1     0.4361   0.5090
            2        Row Mean Scores Differ      1     0.4361   0.5090
            3        General Association         1     0.4361   0.5090

        Estimates of the Common Relative Risk (Row1/Row2)
        Type of Study     Method                Value    95% Confidence Limits
        ----------------------------------------------------------------------
        Case-Control      Mantel-Haenszel      0.9419      0.7882      1.1256
          (Odds Ratio)    Logit                0.9415      0.7882      1.1246
        Cohort            Mantel-Haenszel      0.9422      0.7897      1.1243
          (Col1 Risk)     Logit                0.9419      0.7894      1.1238
        Cohort            Mantel-Haenszel      1.0003      0.9994      1.0013
          (Col2 Risk)     Logit                1.0003      0.9995      1.0012
  • 58. NURSE HEALTH STUDY
        proc freq data=nurse_study order=data;
          weight count;
          tables oc*bc / chisq relrisk;
        run;

        Statistics for Table of oc by bc

        Statistic                      DF      Value     Prob
        -----------------------------------------------------
        Chi-Square                      1    17.8881   <.0001
        Likelihood Ratio Chi-Square     1    18.1401   <.0001
        Continuity Adj. Chi-Square      1    17.5337   <.0001
        Mantel-Haenszel Chi-Square      1    17.8879   <.0001
        Phi Coefficient                      -0.0130
        Contingency Coefficient               0.0130
        Cramer's V                           -0.0130

        Statistics for Table of oc by bc

        Estimates of the Relative Risk (Row1/Row2)
        Type of Study                   Value    95% Confidence Limits
        --------------------------------------------------------------
        Case-Control (Odds Ratio)      0.6944      0.5858      0.8230
        Cohort (Col1 Risk)             0.6957      0.5874      0.8239
        Cohort (Col2 Risk)             1.0019      1.0010      1.0028
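To see the confounding by age in numbers, the crude odds ratio (age ignored) can be set against the Mantel-Haenszel age-adjusted one. The Python sketch below is an added illustration using the counts from the nurse_study DATA step; variable names are hypothetical.

```python
# Counts per age stratum, ordered as
# (OC users: cases, controls; non-users: cases, controls).
strata = [
    (71, 28418, 35, 12267),     # age 30-39
    (143, 20661, 321, 44424),   # age 40-55
]

# Crude OR: collapse the two age strata into one 2x2 table.
a, b, c, d = (sum(cells) for cells in zip(*strata))
crude_or = (a * d) / (b * c)

# Mantel-Haenszel OR: combine the stratum tables with 1/n weights.
numerator = sum(sa * sd / (sa + sb + sc + sd) for sa, sb, sc, sd in strata)
denominator = sum(sb * sc / (sa + sb + sc + sd) for sa, sb, sc, sd in strata)
adjusted_or = numerator / denominator

print(round(crude_or, 4))
print(round(adjusted_or, 4))
```

The crude estimate suggests a protective association of OC use, while the age-adjusted estimate is close to 1, which is the pattern expected when age confounds the relationship.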
  • 60. NURSE HEALTH STUDY
        proc logistic data=nurse_study descending;
          weight count;
          model bc = oc age;
        run;

        Analysis of Maximum Likelihood Estimates
                                     Standard          Wald
        Parameter     DF  Estimate      Error    Chi-Square    Pr > ChiSq
        Intercept      1   -5.9083     0.1156     2612.5788        <.0001
        oc             1   -0.0602     0.0911        0.4360        0.5090
        age            1    0.9835     0.1133       75.3707        <.0001

        Odds Ratio Estimates
                        Point          95% Wald
        Effect       Estimate      Confidence Limits
        oc              0.942       0.788       1.126
        age             2.674       2.141       3.338
  • 61. NURSE HEALTH STUDY
        proc logistic data=nurse_study descending;
          weight count;
          model bc = oc;
        run;

        Analysis of Maximum Likelihood Estimates
                                     Standard          Wald
        Parameter     DF  Estimate      Error    Chi-Square    Pr > ChiSq
        Intercept      1   -5.0704     0.0532     9095.8096        <.0001
        oc             1   -0.3646     0.0867       17.6834        <.0001

        Odds Ratio Estimates
                        Point          95% Wald
        Effect       Estimate      Confidence Limits
        oc              0.694       0.586       0.823