Visualizing Categorical Data with SAS and R
Michael Friendly
York University
Short Course, 2012
Web notes: datavis.ca/courses/VCD/

Part 4: Model-based methods for categorical data

Topics:
  Logit models
    Plots for logit models
    Diagnostic plots for generalized linear models
  Logistic regression models
    Logistic regression: Binary response
    Model plots
    Effect plots for generalized linear models
    Influence measures and diagnostic plots

[Title-slide figures: logit(Admit) = Dept DeptA*Gender, log odds (Admitted) by Department and Gender; Arthritis treatment data, linear and logit regressions on Age; unaided distant vision data, Right Eye Grade by Left Eye Grade; hair colour (Black, Brown, Red, Blond) by eye colour (Brown, Hazel, Green, Blue); Sqrt(frequency) vs. number of males]
Modeling approaches: Overview
Logit models
For a binary response, each loglinear model is equivalent to a logit model (logistic regression, with categorical predictors), e.g., Admit ⊥ Gender | Dept (conditional independence, [AD][DG]):

\[ \log m_{ijk} = \mu + \lambda_i^A + \lambda_j^D + \lambda_k^G + \lambda_{ij}^{AD} + \lambda_{jk}^{DG} \]

So, for admitted (i = 1) and rejected (i = 2), we have:

\[ \log m_{1jk} = \mu + \lambda_1^A + \lambda_j^D + \lambda_k^G + \lambda_{1j}^{AD} + \lambda_{jk}^{DG} \quad (7) \]
\[ \log m_{2jk} = \mu + \lambda_2^A + \lambda_j^D + \lambda_k^G + \lambda_{2j}^{AD} + \lambda_{jk}^{DG} \quad (8) \]

Thus, subtracting (7)-(8), terms not involving Admit will cancel:

\[ L_{jk} = \log m_{1jk} - \log m_{2jk} = \log(m_{1jk}/m_{2jk}) = \text{log odds of admission} \]
\[ L_{jk} = (\lambda_1^A - \lambda_2^A) + (\lambda_{1j}^{AD} - \lambda_{2j}^{AD}) \]
\[ L_{jk} = \alpha + \beta_j^{Dept} \quad \text{(renaming terms)} \]

where,
  \alpha: overall log odds of admission
  \beta_j^{Dept}: effect on admissions of department j
  associations among predictors are assumed, but don't appear in the logit model
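The equivalence of the loglinear and logit forms can be checked numerically. The following is a minimal sketch in R, assuming the UCBAdmissions table that ships with base R holds the same Berkeley data; the object names (loglin.mod, logit.mod, wide) are illustrative and not taken from the course files.

```r
# Berkeley admissions data: a 2 x 2 x 6 table in base R
berkeley <- as.data.frame(UCBAdmissions)

# Loglinear model [AD][DG]: Admit independent of Gender, given Dept
loglin.mod <- glm(Freq ~ Admit * Dept + Dept * Gender,
                  data = berkeley, family = poisson)

# Reshape to one row per Dept x Gender, with admitted and rejected counts
admitted <- subset(berkeley, Admit == "Admitted")
rejected <- subset(berkeley, Admit == "Rejected")
wide <- merge(admitted, rejected, by = c("Dept", "Gender"),
              suffixes = c(".adm", ".rej"))

# Equivalent logit model: logit(Admit) = alpha + beta_j^Dept
logit.mod <- glm(cbind(Freq.adm, Freq.rej) ~ Dept,
                 data = wide, family = binomial)

# The two fits should agree in residual deviance (G^2) and residual df
c(loglinear = deviance(loglin.mod), logit = deviance(logit.mod))
c(loglinear = df.residual(loglin.mod), logit = df.residual(logit.mod))
```

Only the response-by-predictor terms ([AD]) survive in the logit form; the [DG] association is absorbed by conditioning on the Dept x Gender margin.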
Logit models

Other loglinear models have similar, simpler forms as logit models, where only the relations of the response to the predictors appear in the equivalent logit model.

Admit ⊥ Gender ⊥ Dept (mutual independence, [A][D][G]):

\[ \log m_{ijk} = \mu + \lambda_i^A + \lambda_j^D + \lambda_k^G \]
\[ L_{jk} = (\lambda_1^A - \lambda_2^A) = \alpha \quad \text{(constant log odds)} \]

Admit ⊥ Gender | Dept, except for Dept. A:

\[ \log m_{ijk} = \mu + \lambda_i^A + \lambda_j^D + \lambda_k^G + \lambda_{ij}^{AD} + \lambda_{jk}^{DG} + \delta_{(j=1)} \lambda_{ik}^{AG} \]
\[ L_{jk} = \log(m_{1jk}/m_{2jk}) = \alpha + \beta_j^{Dept} + \delta_{(j=1)} \beta^{Gender} \]

where,
  \beta_j^{Dept}: effect on admissions for department j
  \delta_{(j=1)} \beta^{Gender}: 1 df term for effect of gender in Dept. A (\delta_{(j=1)} = 1 for Dept. A, 0 otherwise)

Fitting logit models

Logit models: Overview

Fitting procedures
  SAS: PROC CATMOD, PROC LOGISTIC; PROC GENMOD / dist=poisson
  SPSS: Logistic regression, Loglinear → Logit, Generalized Linear Models
  R: glm(), gnm()

Visualization procedures
  CATPLOT macro - plot predicted, observed log odds from CATMOD
  INFLGLIM macro - influence plots for generalized linear models
  HALFNORM macro - half-normal plot of residuals for generalized linear models

SAS craft
  All SAS procedures → output dataset with obs., fitted values, residuals, diagnostics, etc.
  New model → new output dataset
  Plotting steps remain the same
  Similar ideas for SPSS, R

Plots for logit models

Fit: PROC CATMOD; plot: CATPLOT macro
Model: Admit ~ Gender + Dept ≡ loglinear [AD] [AG] [DG]

  proc catmod order=data data=berkeley;
    weight freq;
    response / out=predict;
    model admit = dept gender / ml;
  %catplot(data=predict, xc=dept, class=gender,
           type=FUNCTION, z=1.96, legend=legend1);

  Plots observed and predicted on the logit scale (type=FUNCTION)
  Main effects model → parallel profiles
  Probabilities on a separate scale (added below)

[Figure: Model: logit(Admit) = Dept Gender. Log odds (Admitted) vs. Department (A-F), separate profiles by Gender (Female, Male), with a Probability (Admitted) scale at right]
Logit models: details

Model: Admit ~ Gender + Dept ≡ [AD] [AG] [DG]

catberk2.sas
  %include catdata(berkeley);
  proc catmod order=data data=berkeley;
    weight freq;
    response / out=predict;
    model admit = dept gender / ml;
  run;

PROC CATMOD output: Overall tests and goodness of fit

  Maximum Likelihood Analysis of Variance
  Source              DF   Chi-Square   Pr > ChiSq
  ------------------------------------------------
  Intercept            1       262.49       <.0001
  dept                 5       534.78       <.0001
  gender               1         1.53       0.2167
  Likelihood Ratio     5        20.20       0.0011

  No effect of Gender; big effect of Dept
  LR test (vs. saturated model): Model doesn't fit well
  Why? How to modify?

Plots for logit models: Output data set

PROC CATMOD output data set: observed & predicted, probabilities & logits

  dept  gender  admit   _TYPE_     _OBS_   _PRED_  _SEPRED_
  A     Male            FUNCTION   0.492    0.582     0.069
  A     Male    Admit   PROB       0.621    0.642     0.016
  A     Male    Reject  PROB       0.379    0.358     0.016
  A     Female          FUNCTION   1.544    0.682     0.099
  A     Female  Admit   PROB       0.824    0.664     0.022
  A     Female  Reject  PROB       0.176    0.336     0.022
  B     Male            FUNCTION   0.534    0.539     0.086
  B     Male    Admit   PROB       0.630    0.631     0.020
  B     Male    Reject  PROB       0.370    0.369     0.020
  B     Female          FUNCTION   0.754    0.639     0.116
  B     Female  Admit   PROB       0.680    0.654     0.026
  B     Female  Reject  PROB       0.320    0.346     0.026
  ...
  F     Male            FUNCTION  -2.770   -2.724     0.158
  F     Male    Admit   PROB       0.059    0.062     0.009
  F     Male    Reject  PROB       0.941    0.938     0.009
  F     Female          FUNCTION  -2.581   -2.625     0.158
  F     Female  Admit   PROB       0.070    0.068     0.010
  F     Female  Reject  PROB       0.930    0.932     0.010

This contains both the observed and fitted logit values (_TYPE_='FUNCTION') and probabilities (_TYPE_='PROB')
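A rough R analogue of this output data set can be built with glm() and predict(). This is a sketch only, assuming the base R UCBAdmissions table matches the berkeley data set; the names wide, berk.main and out are illustrative.

```r
berkeley <- as.data.frame(UCBAdmissions)

# One row per Dept x Gender, with admitted and rejected counts
admitted <- subset(berkeley, Admit == "Admitted")
rejected <- subset(berkeley, Admit == "Rejected")
wide <- merge(admitted, rejected, by = c("Dept", "Gender"),
              suffixes = c(".adm", ".rej"))

# Main effects logit model: logit(Admit) = Dept + Gender
berk.main <- glm(cbind(Freq.adm, Freq.rej) ~ Dept + Gender,
                 data = wide, family = binomial)

# Fitted logits and their standard errors (like _PRED_ and _SEPRED_)
pred <- predict(berk.main, type = "link", se.fit = TRUE)
out <- data.frame(Dept   = wide$Dept,
                  Gender = wide$Gender,
                  obs    = log(wide$Freq.adm / wide$Freq.rej),  # observed logit
                  pred   = pred$fit,
                  sepred = pred$se.fit)

# 95% limits on the logit scale, and fitted values as probabilities
out$lower <- out$pred - 1.96 * out$sepred
out$upper <- out$pred + 1.96 * out$sepred
out$prob  <- plogis(out$pred)

out
deviance(berk.main)   # likelihood ratio G^2 against the saturated model
```

The obs, pred and sepred columns correspond to the _TYPE_='FUNCTION' rows above; plogis() converts the fitted logits to the probabilities shown in the _TYPE_='PROB' rows.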
CATPLOT macro
  Plot logit values (_TYPE_='FUNCTION') or probabilities (_TYPE_='PROB')
  With the PSCALE macro, can plot on the logit scale, with a probability scale on the right.

catberk2.sas
  %pscale(lo=-4, hi=3, anno=pscale);
  title 'Model: logit(Admit) = Dept Gender'
        a=-90 'Probability (Admitted)';
  axis1 order=(-3 to 2) offset=(4)
        label=(a=90 'Log Odds (Admitted)');
  axis2 label=('Department') offset=(4);
  %catplot(data=predict, class=gender, xc=dept,
           type=FUNCTION,     /* plot logit values         */
           z=1.96,            /* show 1.96 x SE -> 95% CI  */
           anno=pscale);      /* add probability scale     */

[Figure: Log Odds (Admitted) vs. Department, by Gender (Female, Male), with Probability (Admitted) scale at right]

  No effect of Gender, except in Dept A (Females more likely admitted!)
Fitting and graphing other models

  Change MODEL statement → new fitted values
  Plotting step remains the same
  Admit ⊥ Gender | Dept, except for Dept. A: Admit ~ Dept + \delta_{(j=1)} Gender

  proc catmod order=data data=berkeley;
    response / out=predict;
    model admit = dept dept1AG / ml;
  %catplot(data=predict, xc=dept, class=gender,
           type=FUNCTION, z=1.96, legend=legend1);

[Figure: fitted and observed log odds vs. Department, by Gender (Female, Male)]

Fitting and graphing other models: details

Model: Admit ⊥ Gender | Dept, except for Dept. A
Need to define a dummy variable for effect of Gender in Dept. A

catberk6.sas
  %include catdata(berkeley);
  data berkeley;
    set berkeley;
    *-- Dummy variable for Gender in Dept A;
    dept1AG = (gender='F') * (dept=1);
    format dept dept.;
  proc catmod order=data data=berkeley;
    weight freq;
    population dept gender;
    direct dept1AG;
    response / out=predict;
    model admit = dept dept1AG / ml;
  run;
  ...
Fitting and graphing other models: details

PROC CATMOD output:

  Maximum Likelihood Analysis of Variance
  Source              DF   Chi-Square   Pr > ChiSq
  ------------------------------------------------
  Intercept            1       291.22       <.0001
  dept                 5       571.45       <.0001
  dept1AG              1        16.04       <.0001
  Likelihood Ratio     5         2.68       0.7489

  Fits well! How to interpret?

  Analysis of Maximum Likelihood Estimates
                              Standard     Chi-
  Parameter       Estimate      Error     Square   Pr > ChiSq
  -----------------------------------------------------------
  Intercept        -0.6685     0.0392     291.22       <.0001
  dept      A       1.1606     0.0705     271.21       <.0001
            B       1.2113     0.0802     227.95       <.0001
            C       0.0528     0.0687       0.59       0.4426
            D       0.00358    0.0727       0.00       0.9607
            E      -0.4210     0.0871      23.34       <.0001
  dept1AG           1.0521     0.2627      16.04       <.0001

PROC CATMOD: observed and predicted logits:

catberk6.sas
  proc print data=predict;
    id dept gender;
    var _obs_ _pred_ _sepred_;
    format _numeric_ 6.3 dept dept.;
    where(_type_='FUNCTION');

  dept  gender    _OBS_   _PRED_  _SEPRED_
  A     M         0.492    0.492     0.072
  A     F         1.544    1.544     0.253
  B     M         0.534    0.543     0.086
  B     F         0.754    0.543     0.086
  C     M        -0.536   -0.616     0.069
  C     F        -0.660   -0.616     0.069
  D     M        -0.704   -0.665     0.075
  D     F        -0.622   -0.665     0.075
  E     M        -0.957   -1.090     0.095
  E     F        -1.157   -1.090     0.095
  F     M        -2.770   -2.676     0.152
  F     F        -2.581   -2.676     0.152
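The same model can be sketched in R with a hand-built dummy variable. This assumes the base R UCBAdmissions table holds the same data; the names wide and berk.deptA are illustrative. R uses reference-cell (treatment) coding for Dept by default, so the Dept coefficients will differ from the CATMOD estimates above, while the dept1AG coefficient and the fit should be comparable.

```r
berkeley <- as.data.frame(UCBAdmissions)

# One row per Dept x Gender, with admitted and rejected counts
admitted <- subset(berkeley, Admit == "Admitted")
rejected <- subset(berkeley, Admit == "Rejected")
wide <- merge(admitted, rejected, by = c("Dept", "Gender"),
              suffixes = c(".adm", ".rej"))

# Dummy variable: 1 for female applicants to Dept A, 0 otherwise
wide$dept1AG <- as.numeric(wide$Dept == "A" & wide$Gender == "Female")

# logit(Admit) = Dept + (Gender effect in Dept A only)
berk.deptA <- glm(cbind(Freq.adm, Freq.rej) ~ Dept + dept1AG,
                  data = wide, family = binomial)

coef(berk.deptA)["dept1AG"]   # log odds ratio, Female vs. Male, in Dept A
deviance(berk.deptA)          # likelihood ratio G^2
df.residual(berk.deptA)
```

One way to read the dept1AG term: exponentiating its coefficient gives the odds ratio of admission for women relative to men in Dept. A; in every other department the model sets that odds ratio to 1.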
catberk6.sas
  title 'logit(Admit) = Dept DeptA*Gender';
  %catplot(data=predict, x=dept, class=gender,
           type=FUNCTION,     /* plot the log odds */
           z=1.96);           /* 95% error bars    */

[Figure: logit(Admit) = Dept DeptA*Gender. Log odds vs. Department, by Gender (Female, Male)]

Diagnostic plots for Generalized Linear Models

INFLGLIM macro: Influence plots for generalized linear models (Williams, 1987)
  Fit: PROC GENMOD; calculates additional diagnostic measures (hat value, Cook's D, etc.)
  Plot: measures of residual (GY = \Delta\chi^2, \chi^2 residual) vs. leverage (GX = hat value), bubble size (area, radius) ~ Cook's D
  → which cells have undue impact on the fitted model?
INFLGLIM macro: Example

Berkeley data, model [AD][GD] ≡ L_{ij} = \alpha + \beta_j^{Dept}

genberk1.sas
  %include catdata(berkeley);
  *-- make a cell ID variable, joining factors;
  data berkeley;
    set berkeley;
    cell = trim(put(dept,dept.)) || gender || trim(put(admit,yn.));
  %inflglim(data=berkeley,
            class=dept gender admit,
            resp=freq,
            model=admit|dept gender|dept,
            dist=poisson,
            id=cell,
            gx=hat, gy=streschi);

  All cells which do not fit (|r_i| > 2) are for department A.
  Males applying to dept A have large leverage → large influence (Cook's D)
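The quantities behind such an influence plot are also available directly in base R. The sketch below assumes the base R UCBAdmissions table holds the same data; cell.diag and flagged are illustrative names, and R's approximate studentized residuals need not match the GENMOD-based values exactly.

```r
berkeley <- as.data.frame(UCBAdmissions)

# Loglinear model [AD][GD], fit as a Poisson GLM
berk.mod <- glm(Freq ~ Dept * (Gender + Admit),
                data = berkeley, family = poisson)

# Leverage, residual and influence for each of the 24 cells
cell.diag <- data.frame(berkeley[, c("Dept", "Gender", "Admit")],
                        hat   = hatvalues(berk.mod),       # leverage
                        rstud = rstudent(berk.mod),        # studentized residual
                        cookd = cooks.distance(berk.mod))  # influence

# Cells that are poorly fitted
flagged <- subset(cell.diag, abs(rstud) > 2)
flagged

# Cells ordered by influence
head(cell.diag[order(-cell.diag$cookd), ])
```

Plotting rstud against hat, with symbol size proportional to cookd, gives the same kind of bubble display as the macro.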
Influence plots in R

The influencePlot() function in the car package gives similar plots:

berkeley-diag.R
  library(car)
  berkeley <- as.data.frame(UCBAdmissions)
  ...
  berk.mod <- glm(Freq ~ Dept * (Gender+Admit), data=berkeley, family="poisson")
  # newer versions of car take id=list(n=3, col="red") in place of id.n, id.col
  influencePlot(berk.mod, id.n=3, id.col="red")

[Figure: influence plot, Studentized Residuals vs. Hat-Values; labeled cells include AFAdm, AMRej, AFRej, AMAdm, FMRej, BMRej, BMAdm]

Diagnostic plots for Generalized Linear Models

HALFNORM macro: Half-normal plot of residuals (Atkinson, 1981)
  Plot ordered absolute residuals, |r|_{(i)}, vs. expected normal values, |z|_{(i)}
  Standard normal confidence envelope not suitable for GLMs
  Simulate reference line and envelope with simulated confidence intervals

genberk1.sas
  %halfnorm(data=berkeley,
            class=dept gender admit,
            resp=freq,
            model=dept|gender dept|admit,
            dist=poisson, id=cell);
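The simulation idea behind the half-normal plot can be sketched in a few lines of R. This is not the HALFNORM macro itself: it assumes the base R UCBAdmissions data, an arbitrary seed, 19 simulated data sets, and the common approximation qnorm((i + n - 1/8) / (2n + 1/2)) for the expected half-normal order statistics. All object names are illustrative.

```r
set.seed(2012)  # arbitrary seed, so the simulated envelope is reproducible

berkeley <- as.data.frame(UCBAdmissions)
berk.mod <- glm(Freq ~ Dept * (Gender + Admit),
                data = berkeley, family = poisson)

# Ordered absolute standardized deviance residuals, |r|_(i)
absres <- sort(abs(rstandard(berk.mod)))
n <- length(absres)

# Expected half-normal order statistics, |z|_(i)
z <- qnorm((seq_len(n) + n - 1/8) / (2 * n + 1/2))

# Simulate responses from the fitted model, refit the same model,
# and collect the ordered absolute residuals from each simulation
nsim <- 19
sims <- simulate(berk.mod, nsim = nsim)
simres <- sapply(sims, function(y) {
  refit <- glm(y ~ Dept * (Gender + Admit),
               data = berkeley, family = poisson)
  sort(abs(rstandard(refit)))
})

# Envelope: smallest and largest simulated value at each order statistic
envelope <- apply(simres, 1, range)   # row 1 = lower, row 2 = upper

plot(z, absres, ylim = range(absres, envelope),
     xlab = "Expected value of half normal quantile",
     ylab = "Absolute std deviance residual")
lines(z, rowMeans(simres))            # simulated reference line
lines(z, envelope[1, ], lty = 2)
lines(z, envelope[2, ], lty = 2)
```

Points that fall above the upper envelope are cells the model fits worse than would be expected if the model were correct.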
[Figure: half-normal plot, Absolute Std Deviance Residual vs. Expected value of half normal quantile; labeled points AF-, AM-, AM+, AF+, EF+]

  Points with largest |residual| labeled
  The model fits well, except in department A.

Logistic regression models

Response variable
  Binary response: success/failure, vote: yes/no
  Binomial data: x successes in n trials (grouped data)
  Ordinal response: none < some < severe depression
  Polytomous response: vote Liberal, Tory, NDP, Green

Explanatory variables
  Quantitative regressors: age, dose
  Transformed regressors: √age, log(dose)
  Polynomial regressors: age^2, age^3, ...
  Categorical predictors: treatment, sex
  Interaction regressors: treatment × age, sex × age
Logistic regression models: Binary response

For a binary response, Y ∈ (0, 1), want to predict \pi = Pr(Y = 1 | x)
  Linear regression will give predicted values outside 0 ≤ \pi ≤ 1
  Logistic model:
    logit(\pi_i) ≡ \log[\pi / (1 - \pi)] avoids this problem
    logit is interpretable as "log odds" that Y = 1
  Probit (normal transform) model → similar predictions, but is less interpretable

[Figure: Probability vs. Predictor (-3 to 3), comparing Linear, Logistic and Normal curves]

Quantitative predictor: Linear and Logit regression on age
  Except in extremes, linear and logistic models give similar predicted values

[Figure: Arthritis treatment data, Linear and Logit Regressions on Age; Probability (Improved) vs. AGE (20 to 80)]

For a binary response, Y ∈ (0, 1), let x be a vector of p regressors, and \pi_i be the probability, Pr(Y = 1 | x). The logistic regression model is a linear model for the log odds, or logit that Y = 1, given the values in x,

\[ \mathrm{logit}(\pi_i) \equiv \log\left(\frac{\pi_i}{1 - \pi_i}\right) = \alpha + x_i^T \beta = \alpha + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} \]

An equivalent (non-linear) form of the model may be specified for the probability, \pi_i, itself,

\[ \pi_i = \{1 + \exp(-[\alpha + x_i^T \beta])\}^{-1} \]

The logistic model is a linear model for the log odds, but also a multiplicative model for the odds of "success",

\[ \frac{\pi_i}{1 - \pi_i} = \exp(\alpha + x_i^T \beta) = \exp(\alpha) \exp(x_i^T \beta) \]

so, increasing x_{ij} by 1 increases logit(\pi_i) by \beta_j, and multiplies the odds by e^{\beta_j}.

Fitting

PROC LOGISTIC (or ROBUST macro → M-estimation)
Data:
  Frequency form (from PROC FREQ) when all predictors are discrete
  Case form when any predictors are quantitative
Models:
  CLASS statement (V7+) → no need for dummy variables
    discrete predictors can specify order and parameterization (effect, polynomial, reference cell)
  MODEL statement allows GLM syntax, e.g.,
    proc logistic;
      class Sex Treat;
      model Better = Sex | Treat | Age @2;
    → Better = Sex Treat Age Sex*Treat Sex*Age Treat*Age
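A small numerical check of the three forms of the model, using base R's plogis() (inverse logit) and qlogis() (logit). The coefficient values here are hypothetical, chosen only for illustration.

```r
alpha <- -4.5   # hypothetical intercept
beta  <- 0.05   # hypothetical slope for a single regressor x

x   <- c(40, 41)          # two values of x, one unit apart
eta <- alpha + beta * x   # linear predictor = logit(pi)

p    <- plogis(eta)       # pi = 1 / (1 + exp(-eta))
odds <- p / (1 - p)       # odds = exp(eta)

qlogis(p) - eta           # logit(pi) recovers the linear predictor: zeros
diff(eta)                 # a unit change in x adds beta to the logit
odds[2] / odds[1]         # ... and multiplies the odds by exp(beta)
exp(beta)
```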
Visualizing logistic models

Visualization
  Goal: see and understand the data and fitted model
  LOGODDS macro: Plot observed responses, fitted and smoothed probabilities
  Model plots:
    OUTPUT statement → fitted \hat\pi_i, lower/upper (1 - \alpha) CI, and/or fitted logit, (\hat\alpha + x_i^T \hat\beta) ± z_{1-\alpha/2} se(logit)
    Plot with standard procedures (PROC GCHART, GPLOT)
    Utility macros (BARS, LABEL, POINTS, PSCALE, etc.) for custom displays
  Effect plots → plot hierarchical subset of effects, averaging over those not included.
  INFLOGIS macro: Influence plots for logistic regression models
  ADDVAR macro: Added variable plots for new predictors or transformations of old

Example: Arthritis treatment data

  Predictors: Sex, Treatment (treated, placebo), Age
  Response: improvement (none, some, marked)
  Consider first as binary response: None vs. (Some or Marked) = 'Better'

Data in case form:

arthrit.sas
  data arthrit;
    length treat $7. sex $6. ;
    input id treat $ sex $ age improve @@ ;
    case = _n_;
    better = (improve > 0);     *-- Make binary response;
  datalines ;
  57 Treated Male   27 1    9 Placebo Male   37 0
  46 Treated Male   29 0   14 Placebo Male   44 0
  77 Treated Male   30 0   73 Placebo Male   50 0
  ... (observations omitted)
  56 Treated Female 69 1   42 Placebo Female 66 0
  43 Treated Female 70 1   15 Placebo Female 66 1
                           71 Placebo Female 68 1
                            1 Placebo Female 74 2
  ;
LOGODDS macro: Empirical logit plots

Problems with visualizing discrete outcomes:
  Linearity: Is a linear relation realistic?
  Smoothing: Discrete data often requires smoothing to see!

The LOGODDS macro:
  Show the data: Plot (0/1) responses [stacked or jittered]
  Divide X into groups (e.g., deciles); compute the empirical logit, \log\left(\frac{y_i + 1/2}{n_i - y_i + 1/2}\right), for each
  Linear logistic regression, plus smoothed curve (LOWESS macro)

  %include catdata(arthrit);
  %logodds(data=arthrit,
           x=age, y=Better,   /* vars to plot                */
           smooth=0.5,        /* LOWESS smoothing parameter  */
           plot=logit);       /* plot on logit scale         */

[Figure: Log Odds Better=1 vs. AGE (20 to 80)]
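The empirical logit calculation is easy to reproduce by hand. The sketch below uses simulated data rather than the arthritis data (the seed, sample size and coefficients are arbitrary assumptions), groups x into quintiles, and compares the empirical logits with a fitted linear logistic regression.

```r
set.seed(42)  # arbitrary seed; the data below are simulated, not the arthritis data

n.obs  <- 200
age    <- round(runif(n.obs, 23, 74))
better <- rbinom(n.obs, 1, plogis(-2.5 + 0.05 * age))

# Divide age into 5 groups at its quintiles
grp <- cut(age, breaks = quantile(age, probs = seq(0, 1, 0.2)),
           include.lowest = TRUE)

y   <- as.numeric(tapply(better, grp, sum))     # successes per group
n   <- as.numeric(tapply(better, grp, length))  # group sizes
mid <- as.numeric(tapply(age, grp, mean))       # mean age per group

# Empirical logit: the 1/2 keeps it finite when y = 0 or y = n
emp.logit <- log((y + 0.5) / (n - y + 0.5))

# Compare with the linear logistic regression on the logit scale
fit <- glm(better ~ age, family = binomial)
plot(mid, emp.logit, xlab = "Age", ylab = "Log odds (better = 1)")
abline(a = coef(fit)[1], b = coef(fit)[2])
```

If the empirical logits scatter around the fitted line with no systematic curvature, a linear relation on the logit scale is reasonable.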
Smoothing the binary observations

Can also use direct smoothing:

[Figure: Arthritis data: linear logistic and lowess smooth; Prob (Better) vs. Age]

  SAS: PROC LOESS, lowess macro; R: lowess()
  There is a hint that the relation may be non-linear
  But data is thin at the extremes

PROC LOGISTIC: Model fitting and plotting

  Specify ordering of response levels (order= or descending options)
  Specify parameterizations for CLASS variables
  OUTPUT statement to get fitted logits and probabilities

glogist1c.sas
  proc logistic data=arthrit descending;
    class sex (ref=last) treat (ref=first) / param=ref;
    model better = sex treat age;
    output out=results p=prob l=lower u=upper
           xbeta=logit stdxbeta=selogit / alpha=.33;

The output includes:

  Type III Analysis of Effects
                       Wald
  Effect    DF   Chi-Square   Pr > ChiSq
  sex        1       6.2576       0.0124
  treat      1      10.7596       0.0010
  age        1       5.5655       0.0183

  Analysis of Maximum Likelihood Estimates
                                     Standard         Wald
  Parameter          DF   Estimate      Error   Chi-Square   Pr > ChiSq
  Intercept           1    -4.5033     1.3074      11.8649       0.0006
  sex      Female     1     1.4878     0.5948       6.2576       0.0124
  treat    Treated    1     1.7598     0.5365      10.7596       0.0010
  age                 1     0.0487     0.0207       5.5655       0.0183
  Odds Ratio Estimates
                                  Point        95% Wald
  Effect                       Estimate   Confidence Limits
  sex   Female vs Male            4.427     1.380    14.204
  treat Treated vs Placebo        5.811     2.031    16.632
  age                             1.050     1.008     1.093

Parameter estimates (reference cell coding):
  \beta_1 = 1.49 ⇒ odds of improvement for Females are e^{1.49} = 4.43 times those for Males
  \beta_2 = 1.76 ⇒ odds of improvement for Treated are e^{1.76} = 5.81 times those for Placebo
  \beta_3 = 0.0487 ⇒ odds ratio = 1.05 ⇒ odds of improvement increase 5% each year. Over 10 years, odds of improvement are multiplied by e^{10 × 0.0487} = 1.63, a 63% increase.

PROC LOGISTIC: Full-model plots

Full-model plots display the fitted (predicted) values over all combinations of predictors:
  Plot fitted values from the dataset specified on the OUTPUT statement
  Plot either predicted probabilities or logits
  Confidence intervals or standard errors allow showing error bars

The first few observations from the results dataset:

  id  sex   treat    age  better   prob  lower  upper   logit  selogit
  57  Male  Treated   27       1  0.194  0.103  0.334  -1.427    0.758
   9  Male  Placebo   37       0  0.063  0.032  0.120  -2.700    0.725
  46  Male  Treated   29       0  0.209  0.115  0.350  -1.330    0.728
  14  Male  Placebo   44       0  0.086  0.047  0.152  -2.358    0.658
  77  Male  Treated   30       0  0.217  0.122  0.357  -1.281    0.713
  73  Male  Placebo   50       0  0.112  0.065  0.188  -2.066    0.622
  ...

  prob: predicted probabilities, with CI (lower, upper)
  logit: predicted logit, with standard error selogit
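The odds ratios and fitted values can be reproduced from the printed estimates. This sketch simply plugs in the rounded coefficients from the PROC LOGISTIC output, so results agree with the tables only to about three decimals; the object names are illustrative.

```r
# Rounded estimates copied from the PROC LOGISTIC output
b <- c(Intercept = -4.5033, sexFemale = 1.4878,
       treatTreated = 1.7598, age = 0.0487)

# Odds ratios: exponentiate the coefficients
or <- exp(b[c("sexFemale", "treatTreated", "age")])
round(or, 3)

# Effect of a 10-year difference in age on the odds
or10 <- unname(exp(10 * b["age"]))
round(or10, 2)

# Fitted logit and probability for a 27-year-old treated male
# (sex = Male and treat = Treated under reference-cell coding)
logit27 <- unname(b["Intercept"] + b["treatTreated"] + b["age"] * 27)
prob27  <- plogis(logit27)
round(c(logit = logit27, prob = prob27), 3)
```

The last line corresponds to the first row (id 57) of the results dataset.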

PROC LOGISTIC: Model plots

Basic plots:
  Plot either logit or probability vs. one predictor (continuous or most levels)
  Separate curves for one factor (= factor)
  Separate panels for all others (BY statement)

  proc gplot data=results;
    plot (logit prob) * age = treat;        /* separate curves */
    by sex;                                 /* separate panels */
    symbol1 v=circle i=join l=3 c=black;    /* placebo         */
    symbol2 v=dot    i=join l=1 c=red;      /* treated         */

  SYMBOL statement → define the point value (v=), interpolate option (i=), line style (l=), color (c=), etc.

Enhanced plots:
  Plot on logit scale, with probability scale at right (PSCALE macro)
  Show 67% error bars ≈ ±1 se (BARS macro)
  Custom legend and panel labels (LABEL macro)

[Figure: Log Odds Improved vs. Age (20 to 80), panels for Female and Male, curves for Treated and Placebo, Probability Improved scale at right]

Enhanced plots:

glogist1c.sas
  *-- Error bars, on logit scale;
  %bars(data=results, var=logit, class=age, cvar=treat, by=age,
        barlen=selogit, out=bars);
  *-- Custom legends and panel labels;
  %label(data=results, y=logit, x=age, xoff=1, cvar=treat, by=sex,
         subset=last.treat, out=label1, pos=6, text=treat);
  %label(data=results, y=2.5, x=20, size=2, by=sex,
         subset=first.sex, out=label2, pos=6, text=sex);
  *-- Probability scales at right;
  %pscale(out=pscale, byvar=sex, byval=%str('Female','Male'));
  *-- Join ANNOTATE datasets;
  data bars;
    set label1 label2 bars pscale;
  proc sort;
    by sex;

glogist1c.sas
  title ' '
        h=1.8 a=-90 'Probability Improved'   /* right axis label */
        h=2.5 a=-90 ' ';                     /* extra space      */
  goptions hby=0;                            /* suppress BY values */
  proc gplot data=results;
    plot logit * age = treat /
         vaxis=axis1 haxis=axis2 hm=1 vm=1
         nolegend anno=bars frame;
    by sex;
    axis1 label=(a=90 'Log Odds Improved') order=(-3 to 3);
    axis2 order=(20 to 80 by 10) offset=(2,6);
    symbol1 v=+ i=join l=3 c=black;
    symbol2 v=- i=join l=1 c=red;
    label age='Age';
  run;

[Figure: Log Odds Improved vs. Age, panels for Female and Male, Treated and Placebo curves with ±1 se error bars, Probability Improved scale at right]
Effect plots
General ideas

Models with interactions

Plotting fitted values
  Only need to change the MODEL statement
  Output dataset automatically incorporates all model terms
  Plotting steps remain exactly the same

  proc logistic data=arthrit descending;
    class sex (ref=last) treat (ref=first) / param=ref;
    model better = treat sex | age @2;
    output out=results p=prob l=lower u=upper
           xbeta=logit stdxbeta=selogit / alpha=.33;

Effect plots: basic ideas

Show a given effect (and low-order relatives) controlling for other model effects.
Effect plots for generalized linear models: Details

Fox (1987):
  For simple models, full model plots show the complete relation between response and all predictors.
  For complex models, often wish to plot a specific main effect or interaction (including lower-order relatives), controlling for other effects
    Fit full model to data with linear predictor (e.g., logit) \eta = X\beta and link function g(\mu) = \eta → estimate b of \beta and covariance matrix \hat{V}(b) of b.
    Vary each predictor in the term over its range
    Fix other predictors at "typical" values (mean, median, proportion in the data)
    → "effect model matrix," X^*
    Calculate fitted effect values, \hat\eta^* = X^* b.
    Standard errors are square roots of diag(X^* \hat{V}(b) X^{*T})
    Plot \hat\eta^*, or values transformed back to scale of response, g^{-1}(\hat\eta^*).

Note: This provides a general means to visualize interactions in all linear and generalized linear models.

Effect plots software

General method
  Create a grid of values for predictors in the effect (EXPGRID macro)
  Fix other predictors at "typical" values (mean, median, proportion in the data)
  Concatenate grid with data
  Fit model → output data set → fitted values in the grid
  Standard errors automatically calculated
  Plot fitted values in the grid

EFFPLOT macro
  Works with PROC REG, PROC GLM, PROC LOGISTIC, PROC GENMOD
  Uses MEANPLOT macro to do the plotting
  Some limitations: can't plot correct standard errors

SAS 9.3 ODS Graphics
  Several procedures now do effects-like plots: LOGISTIC, GLM, GLIMMIX
  Easy; PROC LOGISTIC quite flexible

R: effects package
  Most general: Handles linear models (lm()), generalized linear models (glm()), multinomial (multinom()) and proportional-odds (polr()) models.
  allEffects(model) calculates effects for all high-order terms in model
  plot(allEffects(model)) plots them
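The effect calculation described above can be carried out by hand with base R matrix operations. This sketch uses simulated data with hypothetical predictors age and dose (the seed and coefficients are arbitrary assumptions) and is not the code used by the effects package or the EFFPLOT macro.

```r
set.seed(123)  # arbitrary seed; data are simulated for illustration only

n <- 300
d <- data.frame(age = runif(n, 20, 80), dose = rnorm(n, 10, 2))
d$y <- rbinom(n, 1, plogis(-3 + 0.04 * d$age + 0.1 * d$dose))

mod <- glm(y ~ age + dose, data = d, family = binomial)
b <- coef(mod)    # estimate b of beta
V <- vcov(mod)    # estimated covariance matrix V(b)

# Effect of age: vary age over its range, fix dose at its mean
grid  <- data.frame(age = seq(20, 80, by = 10), dose = mean(d$dose))
Xstar <- model.matrix(~ age + dose, data = grid)   # effect model matrix X*

eta <- drop(Xstar %*% b)                           # fitted effect values X* b
se  <- sqrt(diag(Xstar %*% V %*% t(Xstar)))        # their standard errors

# Transform back to the probability scale with the inverse link
effect.age <- data.frame(age   = grid$age,
                         fit   = plogis(eta),
                         lower = plogis(eta - 1.96 * se),
                         upper = plogis(eta + 1.96 * se))
effect.age
```

The same numbers come from predict(mod, newdata = grid, se.fit = TRUE), which is a convenient check on the matrix arithmetic.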
Effect plots: Example

Cowles and Davis (1987): Volunteering for a psychology experiment
  Predictors: Sex, Neuroticism, Extraversion
  → strong interaction, Neuroticism × Extraversion

Effect plots: SAS 9.3 ODS Graphics

cowles-logistic-eff.sas
  proc logistic data=cowles outest=parm descending;
    class Sex;
    model Volunteer = Sex Extraver | Neurot / lackfit;
    effectplot slicefit(x=Extraver sliceby=Neurot) / at(sex=1.5) noobs;
    effectplot slicefit(x=Neurot sliceby=Extraver) / at(sex=1.5) noobs;
    effectplot contour(x=Neurot y=Extraver) / at(sex=1.5) noobs;
  run;
Effect plots: SAS 9.3 ODS Graphics

cowles-logistic-eff.sas
  proc logistic data=cowles outest=parm descending;
    class Sex;
    model Volunteer = Sex Extraver | Neurot / lackfit;
    effectplot contour(x=Neurot y=Extraver) / at(sex=1.5) noobs;
  run;

SAS 9.2: ODS Graphics

arthritis-logistic-ods.sas
  %include catdata(arthrit);
  ods graphics on;
  proc logistic data=arthrit descending
       plots(only)=(effect(plotby=sex sliceby=treat showobs
                           clband alpha=0.33));
    class sex (ref=last) treat (ref=first) / param=ref;
    model better = sex treat age / clodds=wald;
  run;
  ods graphics off;
Effect plots with the effects package in R

  > library(effects)     ## load the effects package
  > data(Cowles)
  > mod.cowles <- glm(volunteer ~ sex + neuroticism*extraversion,
  +                   data=Cowles, family=binomial)
  > summary(mod.cowles)

  Coefficients:
                            Estimate Std. Error z value Pr(>|z|)
  (Intercept)              -2.358207   0.501320  -4.704 2.55e-06 ***
  sexmale                  -0.247152   0.111631  -2.214  0.02683 *
  neuroticism               0.110777   0.037648   2.942  0.00326 **
  extraversion              0.166816   0.037719   4.423 9.75e-06 ***
  neuroticism:extraversion -0.008552   0.002934  -2.915  0.00355 **
  ---
  Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

  (Dispersion parameter for binomial family taken to be 1)

      Null deviance: 1933.5 on 1420 degrees of freedom
  Residual deviance: 1897.4 on 1416 degrees of freedom
  AIC: 1907.4

Calculate effects for all model terms, plot neuro:extra:

  > eff.cowles <- allEffects(mod.cowles,
  +                          xlevels=list(neuroticism=0:24,
  +                                       extraversion=seq(0, 24, 8)))
  >
  > plot(eff.cowles, 'neuroticism:extraversion', ylab="Prob(Volunteer)",
  +      ticks=list(at=c(.1,.25,.5,.75,.9)), layout=c(4,1), aspect=1)

[Figure: neuroticism*extraversion effect plot; Prob(Volunteer) vs. neuroticism, one panel per level of extraversion]
Extended example: Arrests for Marihuana Possession

Context & background

In Dec. 2002, the Toronto Star examined the issue of racial profiling, by analyzing a data base of 600,000+ arrest records from 1996-2002. They focused on a subset of arrests for which police action was discretionary, e.g., simple possession of small quantities of marijuana, where the police could:
  Release the arrestee with a summons (like a parking ticket)
  Bring to police station, hold for bail, etc. (harsher treatment)

Response variable: released - Yes, No
Main predictor of interest: skin-colour of arrestee (black, white)

Data

Control variables:
  year, age, sex
  employed, citizen - Yes, No
  checks - Number of police data bases (previous arrests, previous convictions, parole status, etc.) in which the arrestee's name was found.

  > library(effects)
  > data(Arrests)
  > some(Arrests)
       released colour year age  sex employed citizen checks
  915        No  Black 2001  35 Male      Yes     Yes      4
  1568      Yes  White 2002  21 Male      Yes     Yes      0
  2981      Yes  White 2000  23 Male      Yes     Yes      2
  3381      Yes  Black 1998  23 Male       No     Yes      2
  3516      Yes  White 2002  22 Male      Yes     Yes      0
  4128       No  White 2001  29 Male      Yes     Yes      1
  4142      Yes  Black 1998  23 Male      Yes     Yes      3
  4634      Yes  White 2001  18 Male      Yes     Yes      0
  4732      Yes  White 1999  21 Male      Yes     Yes      3
  5183      Yes  White 1999  19 Male      Yes     Yes      0
Model

To allow possibly non-linear effects of year, we treat it as a factor:

  > Arrests$year <- as.factor(Arrests$year)

Logistic regression model with all main effects, plus interactions of colour:year and colour:age

  > arrests.mod <- glm(released ~ employed + citizen + checks + colour *
  +     year + colour * age, family = binomial, data = Arrests)
  > Anova(arrests.mod)

  Analysis of Deviance Table (Type II tests)

  Response: released
              LR Chisq Df Pr(>Chisq)
  employed      72.673  1  < 2.2e-16 ***
  citizen       25.783  1  3.820e-07 ***
  checks       205.211  1  < 2.2e-16 ***
  colour        19.572  1  9.687e-06 ***
  year           6.087  5  0.2978477
  age            0.459  1  0.4982736
  colour:year   21.720  5  0.0005917 ***
  colour:age    13.886  1  0.0001942 ***
  ---
  Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Effect plots: colour

Evidence for different treatment of blacks and whites ("racial profiling"), controlling (adjusting) for other factors

  > plot(effect("colour", arrests.mod), multiline = FALSE, ylab = "Probability(released)")

[Figure: colour effect plot; Probability(released) for colour = Black, White]
Effect plots: Interactions

The story turned out to be more nuanced than reported by the Toronto Star, as shown in effect plots for interactions with colour.

  > plot(effect("colour:year", arrests.mod), multiline = TRUE, ...)

[Figure: colour*year effect plot; Probability(released) vs. year (1997 to 2002), lines for colour = Black, White]

  Up to 2000, strong evidence for differential treatment of blacks and whites
  Also evidence to support Police claim of effect of training to reduce racial effects in treatment

  > plot(effect("colour:age", arrests.mod), multiline = TRUE, ...)

[Figure: colour*age effect plot; Probability(released) vs. age (10 to 60), lines for colour = Black, White]

  Opposite age effects for blacks and whites:
    Young blacks treated more harshly than young whites
    Older blacks treated less harshly than older whites
Effect plots: allEffects

All model effects can be viewed together using plot(allEffects(mod))

  > arrests.effects <- allEffects(arrests.mod, xlevels = list(age = seq(15,
  +     45, 5)))
  > plot(arrests.effects, ylab = "Probability(released)", ask = FALSE)

[Figure: effect plots for employed, citizen, checks, colour*year and colour*age; Probability(released) on the vertical axis, separate panels for colour : Black and colour : White]

Effect plots: SAS

Arrests-logistic.sas
  proc logistic data=arrests descending;
    class colour year sex citizen employed;
    model released = colour|year colour|age sex employed citizen checks;
    effectplot interaction (x=year sliceby=colour) /
               clm alpha=0.33 noobs;
    effectplot slicefit (x=age sliceby=colour) /
               clm alpha=0.33 obs(fringe jitter);
  run;

NB: These plots are computed at average levels of quantitative variables, but at reference levels of class variables: Sex=Male, citizen=Yes, employed=Yes
Influence measures and diagnostic plots

  Leverage: Potential impact of an individual case ~ distance from the centroid in space of predictors
  Residuals: Which observations are poorly fitted?
  Influence: Actual impact of an individual case ~ leverage × residual
    C, CBAR: analogs of Cook's D in OLS ~ standardized change in regression coefficients when i-th case is deleted.
    DIFCHISQ, DIFDEV: \Delta\chi^2 when i-th case is deleted.

[Figure: Arthritis treatment data, bubble size: influence coefficient (C); Change in Pearson Chi Square vs. Estimated Probability, and vs. Leverage (Hat value)]

PROC LOGISTIC: printed output with the influence option

  proc logistic data=arthrit descending;
    model better = sex treat age / influence;

  Too much output, doesn't highlight unusual cases, ...

PROC LOGISTIC: plotting diagnostic measures with the plots option

  proc logistic data=arthrit descending
       plots(only label)=(leverage dpc);
    class sex (ref=last) treat (ref=first) / param=ref;
    model better = sex treat age;
  run;

Influence measures and diagnostic plots: Influence plots

  The option plots(label)=dpc gives plots of \Delta\chi^2 (DIFCHISQ, DIFDEV) vs. \hat{p}
  Points are colored according to the influence measure C.
  The two bands of points correspond to better = {0, 1}
INFLOGIS macro
Specialized version of INFLGLIM macro for logistic regression Plots a measure of change in 2 (DIFCHISQ or DIFDEV) vs. predicted probability or leverage. Bubble symbols show actual inuence (C or CBAR) Shows standard cutos for large values Flexible labeling of unusual cases
INFLOGIS macro: Example

logist1b.sas

%include data(arthrit);
%inflogis(data=arthrit,
     class=sex treat,       /* CLASS variables  */
     y=better,              /* response         */
     x=sex treat age,       /* predictors       */
     id=case,               /* case ID          */
     gy=DIFCHISQ,           /* graph ordinate   */
     gx=PRED HAT,           /* graph abscissas  */
     loptions=descending);
Printed output lists cases with large leverage, residual or influence:

case  better  sex     treat    age  pred  hat  difchisq  difdev  c
   1    1     Male    Treated   27  .806  .09    4.578    3.695  0.451
  22    1     Male    Placebo   63  .807  .06    4.460    3.565  0.290
  30    1     Female  Placebo   31  .818  .05    4.749    3.657  0.261
  34    1     Female  Placebo   33  .803  .05    4.296    3.464  0.224
  55    0     Female  Treated   58  .172  .03    4.970    3.676  0.160
  77    0     Female  Treated   69  .108  .03    8.498    4.712  0.276
INFLOGIS macro
[Figure: Arthritis treatment data; bubble size: influence coefficient (C). Change in Pearson chi-square vs. estimated probability]

INFLOGIS macro
[Figure: Arthritis treatment data; bubble size: influence coefficient (C). Change in Pearson chi-square vs. leverage (hat value)]
Diagnostic plots in R
In R, plotting a glm object gives the "regression quartet"

arth.mod1 <- glm(Better ~ Age + Sex + Treatment, data=Arthritis,
                 family='binomial')
plot(arth.mod1)

[Figure: regression quartet for arth.mod1. Panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage (with Cook's distance)]

Diagnostic plots in R

library(car)
influencePlot(arth.mod1)

[Figure: Arthritis data: influencePlot. Studentized Residuals vs. Hat-Values]
The Donner Party
Donner Party: A graphic tale of survival & influence

History:
Apr-May, 1846: Donner/Reed families set out from Springfield, IL to CA
Jul: Bridger's Fort, WY, 87 people, 23 wagons

Donner Party: A graphic tale of survival & influence

History:
"Hastings Cutoff", untried route through Salt Lake Desert, Wasatch Mtns. (90 people)
Worst recorded winter: Oct 31 blizzard
Missed by 1 day, stranded at "Truckee Lake" (now Donner's Lake, Reno)
Rescue parties sent out ("Dire necessity", "Forlorn hope", ...)
Relief parties from CA: 42 survivors (Mar-Apr, '47)
The Donner Party: Who lived and died?

Other analyses, e.g., (Ramsey and Schafer, 1997):
Log Odds (survive) linear with Age
Odds (survive | Women / survive | Men) = 4.9
(Ignored children)

NAME                 AGE  MALE  SURVIVED  DEATH
Antoine               23    1      0      29DEC46
Breen, Edward         13    1      1      .
Breen, Margaret I.     1    0      1      .
Breen, James           5    1      1      .
Breen, John           14    1      1      .
Breen, Mary           40    0      1      .
Breen, Patrick        51    1      1      .
Breen, Patrick Jr.     9    1      1      .
Breen, Peter           3    1      1      .
Breen, Simon           8    1      1      .
Burger, Charles       30    1      0      27DEC46
Denton, John          28    1      0      26FEB47
Dolan, Patrick        40    1      0      27DEC46
Donner, Elitha Cumi   13    0      1      .
Donner, Eliza Poor     3    0      1      .
Donner, Elizabeth     45    0      0      14MAR47
Donner, Francis E.     6    0      1      .
Donner, George        62    1      0      18MAR47
Donner, George Jr.     9    1      1      .
...
Empirical logit plots

Is a linear logistic model satisfactory for these data?
Discrete data often require smoothing to see!

%logodds(data=donner, y=Died, x=Age, smooth=0.5);

[Figure: empirical logit plot, Probability (Died=1) vs. Age]

The relation with Age is quadratic: the youngest and oldest were most likely to perish.
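The same idea can be tried in R with a lowess smooth of the 0/1 response. This is a sketch on simulated data (age and died are hypothetical, generated with a U-shaped risk; they are not the Donner data), and treating the lowess span f = 0.5 as the counterpart of the macro's smooth=0.5 is an assumption.

```r
# Simulated stand-in data (hypothetical; NOT the Donner data)
set.seed(42)
n    <- 90
age  <- round(runif(n, 1, 70))
died <- rbinom(n, 1, plogis(-1 + 0.003 * (age - 30)^2))  # U-shaped risk

# Raw 0/1 responses are hard to read; jitter them and add a lowess smooth
plot(jitter(age), jitter(died, amount = 0.03),
     xlab = "Age", ylab = "Probability Died=1", ylim = c(-0.1, 1.1))
sm <- lowess(age, died, f = 0.5)    # f is the smoother span
lines(sm, lwd = 2)

# Linear vs. quadratic logistic fits, compared by a likelihood-ratio test
lin  <- glm(died ~ age,            family = binomial)
quad <- glm(died ~ age + I(age^2), family = binomial)
anova(lin, quad, test = "Chisq")
```

If the smooth bends away from a monotone curve, as it is built to do here, the likelihood-ratio comparison gives a formal check on the added quadratic term.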
Quadratic model?
Fit: Pr(Death) ~ Age + $\text{Age}^2$ + Male
Statistical evidence for $\text{Age}^2$ equivocal: Wald $\chi^2(1) = 2.84$, $p = 0.09$; but LR $G^2(1) = 4.40$, $p = 0.03$.
...
Analysis of Maximum Likelihood Estimates

            Parameter   Standard   Wald         Pr >
Variable    Estimate    Error      Chi-Square   Chi-Square
INTERCPT    -1.7721     0.5673     9.7588       0.0018
AGE          0.0168     0.0184     0.8355       0.3607
AGE2         0.00208    0.00123    2.8439       0.0917
MALE         1.3745     0.5066     7.3617       0.0067
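The Wald chi-squares in the printed table are just (estimate / standard error) squared, each referred to a chi-square distribution with 1 df. A short R check, using only the numbers copied from the table (the variable names est, se, wald and p are chosen here for illustration):

```r
# Estimates and standard errors copied from the printed table
est <- c(INTERCPT = -1.7721, AGE = 0.0168, AGE2 = 0.00208, MALE = 1.3745)
se  <- c(INTERCPT =  0.5673, AGE = 0.0184, AGE2 = 0.00123, MALE = 0.5066)

wald <- (est / se)^2                              # Wald chi-square, 1 df each
p    <- pchisq(wald, df = 1, lower.tail = FALSE)  # upper-tail p-value

round(cbind(Estimate = est, SE = se, Wald = wald, p = p), 4)
```

Small discrepancies from the printed values (most visibly for AGE2) are to be expected, because the table shows estimates and standard errors rounded to a few significant digits.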
Quadratic model?
Visual evidence is persuasive (but the data are thin at older ages)

[Figure: Probability of Death vs. Age, with curves for Men and Women]

Males: odds of death are exp(1.3745) = 3.95 times those of females, controlling for Age, $\text{Age}^2$
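The multiplier for males applies to the odds, not to the probability, of death. The R sketch below reproduces it from the printed MALE coefficient. The Wald 95% interval and the baseline probability p0 = 0.40 are additions for illustration and do not come from the slides.

```r
b  <- 1.3745   # MALE coefficient from the printed table
se <- 0.5066   # its standard error

or    <- exp(b)                                  # odds multiplier for males
or_ci <- exp(b + c(-1, 1) * qnorm(0.975) * se)   # Wald 95% interval (added here)
round(c(OR = or, lower = or_ci[1], upper = or_ci[2]), 2)

# Effect on the probability scale at a hypothetical baseline risk
p0 <- 0.40                     # assumed female probability of death
p1 <- plogis(qlogis(p0) + b)   # corresponding male probability
c(p0 = p0, p1 = round(p1, 3))
```

Because the shift is constant on the logit scale, the change in probability depends on the baseline: it is largest near p0 = 0.5 and shrinks toward 0 or 1.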
Who was influential?

NAME               Died  Age  M?  PRED  StuRes  Hat  DifDev  C
Breen, Patrick       0    51   1  .921  -2.365  .09   6.25   1.294
Reed, James          0    46   1  .856  -2.054  .08   4.40   0.575
Donner, Elizabeth    1    45   0  .571   1.139  .14   1.24   0.136
Donner, Tamsen       1    44   0  .541   1.183  .12   1.35   0.135
Graves, Elizabeth    1    47   0  .630   1.050  .16   1.04   0.137

Why are they influential?
Patrick Breen, James Reed: Older men who survived
Elizabeth & Tamsen Donner, Elizabeth Graves: Older women who died

Moral lessons of this story:
Don't try to cross the Donner Pass in late October; if you do, bring food
Plots of fitted models show only what is included in the model
Discrete data often need smoothing (or non-linear terms) to see the pattern
Always examine model diagnostics, preferably graphic
Summary: Part 4

Logit models
  Analogous to ANOVA models for a binary response
  Equivalent to loglinear model, including interaction of all predictors
  Fitting: SAS: PROC CATMOD, PROC LOGISTIC; R: glm()
  Visualization: plot fitted logits (or probabilities) vs. factors (CATPLOT macro)
Logistic regression
  Analogous to regression models for a binary response
  Coefficients: increment to log odds per unit change in X; exp(coefficient): multiplier of odds per unit change in X
  Discrete responses: smoothing often useful
  Visualization: plot fitted logits (or probabilities) vs. predictors
Effect plots
  Plot a main effect or interaction in the context of a more complex model
  Shows that effect controlling for (averaged over) all other model effects
  SAS: EFFPLOT macro; R: effects package
Influence & diagnostics
  Influence plots highlight unusual cases/cells with large impact on the fitted model
  Probability plots of residuals help to check model assumptions
  SAS: INFLGLIM macro, HALFNORM macro; R: plot(my.glm), influencePlot(my.glm)
