Generalized Estimating Equations (GEEs)
Purpose: to introduce GEEs
These are used to model correlated data from
Longitudinal/repeated measures studies
Clustered/multilevel studies
Outline
Examples of correlated data
Successive generalizations
Normal linear model
Generalized linear model
GEE
Estimation
Example: stroke data
exploratory analysis
modelling
Correlated data
1. Repeated measures: same subjects, same measure, successive times. We expect successive measurements to be correlated.
[Figure: subjects i = 1, ..., n are randomized to treatment groups (A to C) and measured at successive times, giving Y_{i1}, Y_{i2}, Y_{i3}, Y_{i4}]
Correlated data
2. Clustered/multilevel studies
Level 3
Level 2
Level 1
E.g., Level 3: populations
Level 2: age-sex groups
Level 1: blood pressure measurements in a sample of people in each age-sex group
We expect correlations within populations and within age-sex groups due to genetic, environmental and measurement effects
Notation
Repeated measurements: $y_{ij}$, $i = 1, \dots, N$ subjects; $j = 1, \dots, n_i$ times for subject $i$
Clustered data: $y_{ij}$, $i = 1, \dots, N$ clusters; $j = 1, \dots, n_i$ measurements within cluster $i$
Use "unit" for subject or cluster
Vector of measurements for unit $i$: $y_i = (y_{i1}, y_{i2}, \dots, y_{i n_i})^T$
Vector of measurements for all units: $y = (y_1^T, y_2^T, \dots, y_N^T)^T$
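$E(y_i) = \mu_i = X_i \beta$; $y_i \sim N(\mu_i, V_i)$
Stacking over units:
$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_N \end{pmatrix}, \quad X = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{pmatrix}, \quad V = \begin{pmatrix} V_1 & & 0 \\ & \ddots & \\ 0 & & V_N \end{pmatrix}$$
This block-diagonal V is suitable if the units are independent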
$$\sum_{i} X_i^T V_i^{-1} (y_i - X_i \beta) = 0$$
$$\sum_{i} D_i^T V_i^{-1} (y_i - \mu_i) = 0$$
$$\sum_{i} D_i^T V_i^{-1} (y_i - \mu_i) = 0 \quad \text{where } V_i = A_i^{1/2} R_i A_i^{1/2}$$
Overdispersion parameter
Estimated using the formula:
$$\hat\phi = \frac{1}{N - p} \sum_{i} \sum_{j} \frac{(y_{ij} - \hat\mu_{ij})^2}{\mathrm{var}(\hat\mu_{ij})}$$
where N is the total number of measurements and p is the number of regression parameters
The square root of the overdispersion parameter is called the scale parameter
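As a rough illustration of the overdispersion formula, the sketch below computes the estimate and the scale parameter with NumPy. The function name `overdispersion`, the toy values, and the Poisson-style choice var(mu) = mu are assumptions made for the example, not part of the slides.

```python
import numpy as np

def overdispersion(y, mu, var_mu, p):
    """Sum of (y - mu)^2 / var(mu) over all measurements, divided by N - p."""
    y, mu, var_mu = (np.asarray(a, dtype=float) for a in (y, mu, var_mu))
    n_total = y.size  # N: total number of measurements
    return float(np.sum((y - mu) ** 2 / var_mu) / (n_total - p))

# Made-up values; var(mu) = mu is an assumed Poisson-style variance function.
y = np.array([2.0, 5.0, 1.0, 7.0, 3.0, 6.0])
mu = np.array([3.0, 4.0, 2.0, 6.0, 3.0, 5.0])
phi_hat = overdispersion(y, mu, var_mu=mu, p=2)
scale = np.sqrt(phi_hat)  # scale parameter = square root of overdispersion
```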
Estimation (1)
For Normal linear model
Solve $U(\beta) = \sum_{i} X_i^T V_i^{-1} (y_i - X_i \beta) = 0$ to get
$$\hat\beta = (X^T V^{-1} X)^{-1} X^T V^{-1} y$$
with $\mathrm{var}(\hat\beta) = (X^T V^{-1} X)^{-1}$
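For known V_i, the Normal-model estimate can be computed directly by accumulating the unit-level terms of the score equation. The following is a minimal sketch under that assumption; the function name `gls_fit` and the toy design (two units, three time points, correlation 0.5) are illustrative choices only.

```python
import numpy as np

def gls_fit(X_blocks, V_blocks, y_blocks):
    """Solve sum_i X_i^T V_i^{-1} (y_i - X_i beta) = 0 for beta, with V_i known."""
    p = X_blocks[0].shape[1]
    lhs = np.zeros((p, p))  # accumulates sum_i X_i^T V_i^{-1} X_i
    rhs = np.zeros(p)       # accumulates sum_i X_i^T V_i^{-1} y_i
    for X_i, V_i, y_i in zip(X_blocks, V_blocks, y_blocks):
        W_i = np.linalg.inv(V_i)
        lhs += X_i.T @ W_i @ X_i
        rhs += X_i.T @ W_i @ y_i
    beta_hat = np.linalg.solve(lhs, rhs)
    cov_beta = np.linalg.inv(lhs)  # (X^T V^{-1} X)^{-1}
    return beta_hat, cov_beta

# Toy data: two units measured at three times, noise-free line y = 1 + 2t.
t = np.array([0.0, 1.0, 2.0])
X_i = np.column_stack([np.ones(3), t])
V_i = 0.5 * np.eye(3) + 0.5 * np.ones((3, 3))  # unit variance, correlation 0.5
y_blocks = [1.0 + 2.0 * t, 1.0 + 2.0 * t]
beta_hat, cov_beta = gls_fit([X_i, X_i], [V_i, V_i], y_blocks)
```

Because V is block diagonal, summing over units gives the same result as forming the full stacked matrices, without building an N-by-N block matrix.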
More generally, unless $V_i$ is known, iteration is needed to solve
$$U(\beta) = \sum_{i} D_i^T V_i^{-1} (y_i - \mu_i) = 0$$
$\hat\mu_i = g^{-1}(X_i \hat\beta)$
And residuals: $y_i - \hat\mu_i$
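To make the iteration concrete, here is a hedged sketch of Fisher scoring for the estimating equation. It is not the slides' own algorithm: it assumes a log link with Poisson-type variance (var = mu), a fixed working correlation matrix R of the same size for every unit, and an overdispersion parameter of 1. The function name `gee_poisson`, the starting value, and the toy counts are made up for illustration. The robust covariance is the usual sandwich form built from the unit-level score contributions.

```python
import numpy as np

def gee_poisson(X_blocks, y_blocks, R, n_iter=50, tol=1e-10):
    """Fisher scoring for sum_i D_i^T V_i^{-1} (y_i - mu_i) = 0 (log link, R fixed)."""
    X_all = np.vstack(X_blocks)
    y_all = np.concatenate(y_blocks)
    # Assumed starting value: least squares on log(y + 0.5).
    beta = np.linalg.lstsq(X_all, np.log(y_all + 0.5), rcond=None)[0]
    p = X_all.shape[1]
    for _ in range(n_iter):
        score, info, meat = np.zeros(p), np.zeros((p, p)), np.zeros((p, p))
        for X_i, y_i in zip(X_blocks, y_blocks):
            mu_i = np.exp(X_i @ beta)        # mu_i = g^{-1}(X_i beta)
            D_i = mu_i[:, None] * X_i        # d mu_i / d beta for the log link
            A_half = np.diag(np.sqrt(mu_i))  # A_i^{1/2}, with var(y_ij) = mu_ij
            V_inv = np.linalg.inv(A_half @ R @ A_half)
            u_i = D_i.T @ V_inv @ (y_i - mu_i)
            score += u_i
            info += D_i.T @ V_inv @ D_i
            meat += np.outer(u_i, u_i)
        step = np.linalg.solve(info, score)
        if np.max(np.abs(step)) < tol:
            break  # info and meat were evaluated at the converged beta
        beta = beta + step
    naive_cov = np.linalg.inv(info)             # model-based covariance
    robust_cov = naive_cov @ meat @ naive_cov   # sandwich covariance
    return beta, naive_cov, robust_cov

# Made-up counts: three units, three time points each.
t = np.array([0.0, 1.0, 2.0])
X_i = np.column_stack([np.ones(3), t])
y_blocks = [np.array([1.0, 2.0, 4.0]), np.array([2.0, 3.0, 5.0]),
            np.array([1.0, 3.0, 6.0])]
R_exch = 0.5 * np.eye(3) + 0.5 * np.ones((3, 3))  # exchangeable, rho = 0.5
beta_hat, naive_cov, robust_cov = gee_poisson([X_i] * 3, y_blocks, R_exch)
fitted = [np.exp(X_i @ beta_hat) for _ in y_blocks]        # mu_hat_i
residuals = [y - m for y, m in zip(y_blocks, fitted)]      # y_i - mu_hat_i
```

In practice the working correlation parameters and the overdispersion parameter would be re-estimated from the residuals between updates of beta; this sketch holds R fixed to keep the loop short.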
Correlation
For unit i
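$$V_i = \sigma^2 \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1n} \\ \rho_{21} & 1 & & \vdots \\ \vdots & & \ddots & \\ \rho_{n1} & \cdots & & 1 \end{pmatrix}$$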
Types of correlation
1. Independent: Vi is diagonal
2. Exchangeable: all measurements on the same unit are equally correlated, $\rho_{lm} = \rho$
Plausible for clustered data
Other terms: spherical and compound symmetry
Types of correlation
3. Correlation depends on time or distance between
measurements l and m
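The correlation types listed above can be written as small matrix constructors. This is a sketch only: the function names are hypothetical, and the AR(1) form rho^|l - m| is used as one common example of a correlation that depends on the time between measurements l and m.

```python
import numpy as np

def independent_corr(n):
    """Independent: identity correlation, so V_i is diagonal."""
    return np.eye(n)

def exchangeable_corr(n, rho):
    """Exchangeable: every pair of measurements has the same correlation rho."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))

def ar1_corr(n, rho):
    """AR(1), assumed example of time dependence: correlation rho^|l - m|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

R_ind = independent_corr(4)
R_exch = exchangeable_corr(4, 0.3)
R_ar1 = ar1_corr(4, 0.5)
```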
Missing Data
For missing data, the working correlation can be
estimated using the all-available-pairs
method, in which all non-missing pairs of
data are used in the estimators of the
working correlation parameters.
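A simplified sketch of the all-available-pairs idea for an exchangeable working correlation: average the products of standardized residuals over every within-unit pair in which both values are observed. The function name, the NaN coding for missing values, and the omission of any degrees-of-freedom or overdispersion correction are assumptions made to keep the example short.

```python
import numpy as np

def exchangeable_all_pairs(resid):
    """resid: units x times array of standardized residuals, NaN where missing."""
    total, n_pairs = 0.0, 0
    for row in resid:
        obs = row[~np.isnan(row)]  # keep only the observed measurements
        m = obs.size
        if m < 2:
            continue  # a unit with fewer than two values contributes no pairs
        # sum over l < m of r_l * r_m equals ((sum r)^2 - sum r^2) / 2
        total += (obs.sum() ** 2 - np.sum(obs ** 2)) / 2.0
        n_pairs += m * (m - 1) // 2
    return total / n_pairs

# Made-up residuals with missing entries.
resid = np.array([[1.0, 1.0, np.nan],
                  [-1.0, np.nan, -1.0],
                  [np.nan, np.nan, 2.0]])
rho_hat = exchangeable_all_pairs(resid)
```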
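$E(Y_{ij}) = \mu$; $\mathrm{var}(Y_{ij}) = \sigma_u^2 + \sigma_e^2$
$\mathrm{cov}(Y_{ij}, Y_{km}) = \sigma_u^2$, provided $i = k$ and $j \ne m$; $\mathrm{cov}(Y_{ij}, Y_{km}) = 0$ otherwise.
$\rho = \sigma_u^2 / (\sigma_u^2 + \sigma_e^2)$ = ICC
So $V_i$ is exchangeable with elements: $\sigma_u^2 + \sigma_e^2$ on the diagonal and $\sigma_u^2$ off the diagonal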
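A short numerical check of the variance components relationship, with made-up values for the two variances. The function name `random_intercept_cov` is hypothetical; the sketch builds V_i from the two components and confirms that every off-diagonal correlation equals the ICC.

```python
import numpy as np

def random_intercept_cov(n, sigma_u2, sigma_e2):
    """Covariance of n measurements on one unit, plus the implied ICC."""
    icc = sigma_u2 / (sigma_u2 + sigma_e2)
    # sigma_u2 everywhere, with sigma_e2 added on the diagonal
    V = sigma_u2 * np.ones((n, n)) + sigma_e2 * np.eye(n)
    return V, icc

V_i, icc = random_intercept_cov(3, sigma_u2=4.0, sigma_e2=1.0)
sd = np.sqrt(np.diag(V_i))
corr = V_i / np.outer(sd, sd)  # off-diagonal entries all equal the ICC
```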
[Figure: plot against week]
[Figure: plot against week by treatment group; legend: A blue, B black, C red]
[Figure: panels labelled Week1 to Week8]
Numerical example
Correlation matrix between weeks (partial; entries in extraction order):
0.93, 0.88, 0.92, 0.83, 0.88, 0.95, 0.79, 0.71, 0.97, 0.62, 0.92, 0.96, 0.55
Numerical example
1. Pooled analysis ignoring correlation within patients
$Y_{ijk} = \alpha_j + \beta_j k + e_{ijk}$; j for groups, k for time
Numerical example
2. Data reduction
Fit a straight line for each patient:
$Y_{ijk} = \alpha_{ij} + \beta_{ij} k + e_{ijk}$
assume independence and constant variance
use simple linear regression to estimate $\alpha_{ij}$ and $\beta_{ij}$
Perform ANOVA using the estimates $\hat\alpha_{ij}$ as data and groups as levels of a factor in order to compare the $\alpha_j$'s.
Repeat ANOVA using the $\hat\beta_{ij}$'s as data and compare the $\beta_j$'s
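The data reduction steps can be sketched as follows: fit a line per patient, then run a one-way ANOVA on the fitted slopes. The helper names, the four made-up patients in two groups, and the noise-free scores are assumptions for illustration; the stroke data themselves are not reproduced here.

```python
import numpy as np

def per_patient_lines(weeks, scores):
    """Simple linear regression for each row of scores (patients x weeks)."""
    fits = np.array([np.polyfit(weeks, y, deg=1) for y in scores])
    return fits[:, 1], fits[:, 0]  # intercepts, slopes

def one_way_f(values, groups):
    """One-way ANOVA F statistic with groups as levels of a factor."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    grand = values.mean()
    ss_between = sum(values[groups == g].size * (values[groups == g].mean() - grand) ** 2
                     for g in labels)
    ss_within = sum(np.sum((values[groups == g] - values[groups == g].mean()) ** 2)
                    for g in labels)
    df_between = labels.size - 1
    df_within = values.size - labels.size
    return (ss_between / df_between) / (ss_within / df_within)

# Made-up example: four patients, eight weeks, two groups.
weeks = np.arange(1, 9)
groups = np.array(["A", "A", "B", "B"])
true_slopes = np.array([1.0, 2.0, 3.0, 4.0])
scores = 30.0 + true_slopes[:, None] * weeks[None, :]
intercepts, slopes = per_patient_lines(weeks, scores)
f_slopes = one_way_f(slopes, groups)  # compares the group slopes
```

The same `one_way_f` call on the fitted intercepts would give the comparison of group intercepts, provided the intercepts vary within groups.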
Numerical example
3. Repeated measures analyses using
various variance-covariance structures
Fit $Y_{ijk} = \alpha_j + \beta_j k + e_{ijk}$
Numerical example
4. Mixed/Random effects model
Use model
$Y_{ijk} = (\alpha_j + a_{ij}) + (\beta_j + b_{ij}) k + e_{ijk}$
(i) $\alpha_j$ and $\beta_j$ are fixed effects for groups
(ii) other effects are random
Method              Estimate   Asymp SE   Robust SE
Pooled                29.821      5.772
Data reduction        29.821      7.572
GEE, independent      29.821      5.683      10.395
GEE, exchangeable     29.821      7.047      10.395
GEE, AR(1)            33.492      7.624       9.924
GEE, unstructured     30.703      7.406      10.297
Random effects        29.821      7.047
Method              Estimate   Asymp SE   Robust SE
Pooled                 3.348      8.166
Data reduction         3.348     10.709
GEE, independent       3.348      8.037      11.884
GEE, exchangeable      3.348      9.966      11.884
GEE, AR(1)            -0.270     10.782      11.139
GEE, unstructured      2.058     10.474      11.564
Random effects         3.348      9.966
Method              Estimate   Asymp SE   Robust SE
Pooled                -0.022      8.166
Data reduction        -0.018     10.709
GEE, independent      -0.022      8.037      11.130
GEE, exchangeable     -0.022      9.966      11.130
GEE, AR(1)            -6.396     10.782      10.551
GEE, unstructured     -1.403     10.474      10.906
Random effects        -0.022      9.966
Method              Estimate   Asymp SE   Robust SE
Pooled                 6.324      1.143
Data reduction         6.324      1.080
GEE, independent       6.324      1.125       1.156
GEE, exchangeable      6.324      0.463       1.156
GEE, AR(1)             6.074      0.740       1.057
GEE, unstructured      7.126      0.879       1.272
Random effects         6.324      0.463
Method              Estimate   Asymp SE   Robust SE
Pooled                -1.994      1.617
Data reduction        -1.994      1.528
GEE, independent      -1.994      1.592       1.509
GEE, exchangeable     -1.994      0.655       1.509
GEE, AR(1)            -2.142      1.047       1.360
GEE, unstructured     -3.556      1.243       1.563
Random effects        -1.994      0.655
Method              Estimate   Asymp SE   Robust SE
Pooled                -2.686      1.617
Data reduction        -2.686      1.528
GEE, independent      -2.686      1.592       1.502
GEE, exchangeable     -2.686      0.655       1.509
GEE, AR(1)            -2.236      1.047       1.504
GEE, unstructured     -4.012      1.243       1.598
Random effects        -2.686      0.655
Numerical example:
Summary of results
All models produced similar results leading to the same
conclusion: no treatment differences
Pooled analysis and data reduction are useful for
exploratory analysis: they are easy to follow and give good
approximations for estimates, but variances may be
inaccurate
Random effects models give very similar results to GEEs
don't need to specify variance-covariance matrix
model specification may/may not be more natural