An Introduction to HLM with R
Dr. J. Kyle Roberts

Introduction to Hierarchical Linear Modeling with R
[Figure: lattice plot of SCIENCE vs. URBAN in 16 panels, one per group (panels 1-16)]
First Things First
Robinson (1950) and the problem of contextual effects
The Frog-Pond Theory
[Figure: the Frog-Pond Theory -- Pond A vs. Pond B]
A Brief History of Multilevel Models
Nested ANOVA designs
Problems with the ANCOVA design
Do schools differ vs. why schools differ?
ANCOVA does not correct for intra-class correlation (ICC)
Strengths of Multilevel Models
Statistical models that are not hierarchical sometimes ignore structure and report underestimated standard errors
Multilevel techniques are more efficient than other techniques
Multilevel techniques assume a general linear model and can perform all types of analyses
Multilevel Examples
Students nested within classrooms
Students nested within schools
Students nested within classrooms within schools
Measurement occasions nested within subjects (repeated measures)
Students cross-classified by school and neighborhood
Students having multiple membership in schools (longitudinal data)
Patients within a medical center
People within households
Children Nested In Families!!
Do we really need HLM/MLM?
All data are multilevel!
The problem of independence of observations
The inefficiency of OLS techniques
Differences in HLM and Other Methods
HLM is based on Maximum Likelihood and Empirical Bayesian estimation techniques
1 + 1 = 1.5
[Figure: diagram of units X1, X2, X3 with coefficients γ_00, γ_01, γ_02, γ_03]
Graphical Example of Multilevel ANOVA
Notating the HLM ANOVA
The full model would be:
  y_ij = γ_00 + u_0j + e_ij
Level-1 model is:
  y_ij = β_0j + e_ij
Level-2 model is:
  β_0j = γ_00 + u_0j
Written out for each group:
  y_11 = β_01 + e_11      β_01 = γ_00 + u_01
  y_21 = β_01 + e_21      β_02 = γ_00 + u_02
  ...                     ...
  y_ij = β_0j + e_ij      β_0j = γ_00 + u_0j
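The null model above can also be simulated directly, which makes the two error terms concrete. A minimal sketch; the values of γ_00, σ_u0, and σ_e are hypothetical, chosen to echo the lmer output shown later in the deck:

```r
# Simulate y_ij = gamma_00 + u_0j + e_ij for 16 groups of 10 students
set.seed(42)
gamma00 <- 10.7            # grand mean
u0 <- rnorm(16, 0, 5.05)   # group deviations u_0j
e  <- rnorm(160, 0, 1.41)  # individual differences e_ij
group <- rep(1:16, each = 10)
y <- gamma00 + u0[group] + e
tapply(y, group, mean)     # group means scatter widely around 10.7
```

Because σ_u0 is much larger than σ_e here, most of the spread in y comes from the groups, not the individuals.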
Understanding Errors
[Figure: diagram of groups X_1 and X_2 -- level-2 errors u_01, u_02 with variance σ²_u0; observations X_11, X_21, X_12, X_22 with level-1 errors e_11, e_21, e_12, e_22 and variance σ²_e]
Fixed vs. Random Coefficients
Fixed Slopes and Intercepts
[Figure: Fixed Level-1 and Level-2 Coefficients -- Science Achievement vs. School Urbanicity; Acting Out vs. Prescription Level]
Fixed vs. Random Coefficients
Fixed Slopes and Intercepts
Random Intercepts and Fixed Slopes
[Figure: Random Level-1 and Fixed Level-2 -- Science Achievement vs. School Urbanicity; Acting Out vs. Prescription Level]
Fixed vs. Random Coefficients
Fixed Slopes and Intercepts
Random Intercepts and Fixed Slopes
Fixed Intercepts and Random Slopes
[Figure: Fixed Level-1 and Random Level-2 -- Science Achievement vs. School Urbanicity; Acting Out vs. Prescription Level]
Fixed vs. Random Coefficients
Fixed Slopes and Intercepts
Random Intercepts and Fixed Slopes
Fixed Intercepts and Random Slopes
Random Slopes and Intercepts
[Figure: Random Level-1 and Random Level-2 -- Science Achievement vs. School Urbanicity; Acting Out vs. Prescription Level]
Let's Give This a Shot!!!
An example where we use a child's level of urbanicity (an SES composite) to predict their science achievement
Start with the multilevel ANOVA (also called the null model):
  science_ij = γ_00 + u_0j + r_ij
  (grand mean + group deviation + individual difference)
Intraclass Correlation
The proportion of total variance that is between the groups of the regression equation
"The degree to which individuals share common experiences due to closeness in space and/or time" (Kreft & de Leeuw, 1998)
In other words, the ICC is the proportion of group-level variance to the total variance
A LARGE ICC DOES NOT EQUAL LARGE DIFFERENCES BETWEEN MLM AND OLS (Roberts, 2002)
Formula for the ICC:
  ICC = σ²_u0 / (σ²_u0 + σ²_e)
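Using the variance components from the null-model lmer output later in the deck (σ²_u0 = 25.5312, σ²_e = 1.9792), the ICC can be computed by hand:

```r
var_u0 <- 25.5312  # between-group (intercept) variance
var_e  <- 1.9792   # within-group (residual) variance
icc <- var_u0 / (var_u0 + var_e)
icc  # about 0.93: roughly 93% of the variance lies between groups
```

An ICC this large says the groups differ enormously in their means, which is exactly why ignoring the nesting would be costly here.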
Statistical Significance???
Chi-square vs. degrees of freedom in determining model fit
The problem with the df
Can also compute statistical significance of variance components (only available in some packages)
The Multilevel Model: Adding a Level-1 Predictor
Consider the following 1-level regression equation:
  y = a + bx + e
y = response variable
a = intercept
b = coefficient of the explanatory variable (slope)
x = explanatory variable
e = residual or error due to measurement
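This single-level equation is what lm estimates. A minimal sketch on simulated data, with made-up true values a = 2 and b = 0.5:

```r
# Simulate y = a + b*x + e and recover a and b with OLS
set.seed(1)
x <- 1:50
y <- 2 + 0.5 * x + rnorm(50)  # e ~ N(0, 1)
fit <- lm(y ~ x)
coef(fit)  # estimates should land close to a = 2 and b = 0.5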
The Multilevel Model (2)
The fixed coefficients multilevel model is a slight variation on the OLS regression equation:
  y_ij = a + b*x_ij + u_j + e_ij
where i indexes level-1, j indexes level-2, u_j is the level-2 residual, and e_ij is the level-1 residual.
Using slightly different notation we can transform the above equation to:
  y_ij = γ_00 + γ_10*x_ij + u_0j + e_ij
where γ_00 now defines the constant/intercept a and γ_10 defines the slope b.
The Multilevel Model (3)
From the previous model:
  y_ij = γ_00 + γ_10*x_ij + u_0j + e_ij
We can then transform this model to:
  Level-1 model:  y_ij = β_0j + β_1j*x_1ij + e_ij
  Level-2 model:  β_0j = γ_00 + u_0j
                  β_1j = γ_10
with variances var(u_0j) = σ²_u0 and var(e_ij) = σ²_e.
Understanding Errors
[Figure: diagram of the level-1 errors e_ij and the level-2 errors u_0j with variance σ²_u0]
Adding a Random Slope Component
Suppose that we have good reason to assume that it is inappropriate to force the same slope for urbanicity on each school
  Level-1 model:  y_ij = β_0j*x_0 + β_1j*x_1ij + r_ij
  Level-2 model:  β_0j = γ_00 + u_0j
                  β_1j = γ_10 + u_1j
Complete model:
  science_ij = γ_00 + u_0j + (γ_10 + u_1j)*urban_ij + r_ij
Understanding Errors Again
[Figure: diagram including the slope variance σ²_u1 -- ????]
Model Fit Indices
Chi-square
Akaike Information Criterion
Bayesian Information Criterion
  Deviance = -2*ℓ
  AIC = -2*ℓ + 2*K
  BIC = -2*ℓ + K*ln(N)
(ℓ = log-likelihood, K = number of parameters, N = number of observations)
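These formulas can be checked against the null-model output shown later in the deck (logLik = -318.9, K = 2 variance parameters, N = 160 observations):

```r
loglik <- -318.9
K <- 2    # parameters penalized: intercept variance + residual variance
N <- 160
deviance <- -2 * loglik          # 637.8
aic <- -2 * loglik + 2 * K       # ~641.8 (lmer reports 641.9)
bic <- -2 * loglik + K * log(N)  # ~648.0 (lmer reports 648)
c(deviance = deviance, AIC = aic, BIC = bic)
```

The small discrepancies are just rounding of the printed log-likelihood.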
To Center or Not to Center
In regression, the intercept is interpreted as the expected value of the outcome variable when all explanatory variables have the value zero
However, zero may not even be an option in our data (e.g., gender)
This will be especially important when looking at cross-level interactions
General rule of thumb: if you are estimating cross-level interactions, you should grand-mean center the explanatory variables.
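Grand-mean centering is a one-liner in R; a sketch assuming the example data frame used later in the deck:

```r
# Subtract the grand mean so that URBAN.c = 0 means "average urbanicity"
example$URBAN.c <- example$URBAN - mean(example$URBAN)
# In a model fit with URBAN.c, the intercept is the expected SCIENCE
# score at the average level of URBAN, not at the raw value URBAN = 0
```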
An Introduction to R
R as a Big Calculator
R implements the S language, originally developed at AT&T Bell Labs in the 1980s
S was later commercialized as S-PLUS (eventually owned by MathSoft), which incorporated more of the functionality of large math processors
The commands window is like a big calculator
> 2+2
[1] 4
> 3*5+4
[1] 19
Object Oriented Language
A big plus for R is that it utilizes an object oriented language.
> x <- 2+4
> x
[1] 6
> y <- 3+5
> y
[1] 8
> x+y
[1] 14
> x <- 1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> mean(x)
[1] 5.5
> 2*x
 [1]  2  4  6  8 10 12 14 16 18 20
> x^2
 [1]   1   4   9  16  25  36  49  64  81 100
Utilizing Functions in R
R has many built-in functions (c.f., Language Reference in the Help menu)
Functions are commands that contain arguments
The seq function takes the arguments:
seq(from, to, by, length.out, along.with)
> ?seq
> seq(from=1, to=100, by=10)
 [1]  1 11 21 31 41 51 61 71 81 91
> seq(1, 100, 10)
> seq(1, by=10, length=4)
[1]  1 11 21 31
Making Functions in R
> squared <- function(x) {x^2}
> squared(5)
[1] 25
> inverse <- function(x) {1/x}
> num <- c(1, 2, 3, 4, 5)
> inverse(num)
[1] 1.0000000 0.5000000 0.3333333 0.2500000 0.2000000
Sampling from a Single Variable
The sample function is used to draw a random sample from a single vector of scores
sample(x, size, replace, prob)
x = dataset
size = size of the sample to be drawn
replace = toggles sampling with replacement on and off (default = FALSE)
prob = vector of probabilities of the same length as x
Sampling a Vector (cont.)
> x <- 1:30
> sample(x, 10, replace=T)
 [1]  8 14 27  2 30 16  4  9  9  2
> x <- 1:5
> sample(x, 10, replace=T)
 [1] 3 2 2 3 3 4 1 1 2 3
Rcmdr: library(Rcmdr)
Reading a Dataset
> example <- read.table(file, header=T)
> example <- read.table("c:/aera/example.txt", header=T)
> head(example)
Science Achievement/Urbanicity
Back to the multilevel ANOVA example, let's perform an OLS regression using urbanicity to predict science achievement
> summary(lm(SCIENCE~URBAN, example))
Call:
lm(formula = SCIENCE ~ URBAN, data = example)
Residuals:
    Min      1Q  Median      3Q     Max
-5.3358 -2.1292  0.4919  2.0432  5.0090
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.25108    0.59371  -2.107   0.0367 *
URBAN        0.82763    0.03863  21.425   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.592 on 158 degrees of freedom
Multiple R-Squared: 0.7439, Adjusted R-squared: 0.7423
F-statistic: 459 on 1 and 158 DF, p-value: < 2.2e-16

  y = -1.25 + 0.83(urban) + r
R's Functionality
Try these:
> linear <- lm(SCIENCE~URBAN, example)
> summary(linear)
> plot(SCIENCE~URBAN, example)
> abline(linear)
> plot(linear)
> names(linear)
> linear$coefficients
lmer: library(lme4)
Running lmer for example
> fm.null <- lmer(SCIENCE~1 + (1|GROUP), example)
> summary(fm.null)
Linear mixed-effects model fit by REML
Formula: SCIENCE ~ 1 + (1 | GROUP)
   Data: example
   AIC   BIC logLik MLdeviance REMLdeviance
 641.9   648 -318.9      640.2        637.9
Random effects:
 Groups   Name        Variance Std.Dev.
 GROUP    (Intercept) 25.5312  5.0528
 Residual              1.9792  1.4068
number of obs: 160, groups: GROUP, 16
Fixed effects:
            Estimate Std. Error t value
(Intercept)   10.687      1.268   8.428
Notice: No p-values!!

  SCIENCE_ij = γ_00 + u_0j + e_ij
  SCIENCE_ij = 10.687 + u_0j + e_ij
bwplot(GROUP~resid(fm.null), example)
Examining Empirical Bayesian Estimated Group Means
> with(example, tapply(SCIENCE, GROUP, mean))
> coef(fm.null)
Adding our Urban Predictor
> fm1 <- lmer(SCIENCE~URBAN + (1|GROUP), example)
> summary(fm1)
Linear mixed-effects model fit by REML
Formula: SCIENCE ~ URBAN + (1 | GROUP)
   Data: example
   AIC   BIC logLik MLdeviance REMLdeviance
 506.1 515.3 -250.0      499.4        500.1
Random effects:
 Groups   Name        Variance Std.Dev.
 GROUP    (Intercept) 86.45595 9.29817
 Residual              0.65521 0.80945
number of obs: 160, groups: GROUP, 16
Fixed effects:
            Estimate Std. Error t value
(Intercept)  22.3029     2.4263   9.192
URBAN        -0.8052     0.0480 -16.776
Correlation of Fixed Effects:
      (Intr)
URBAN -0.285
coef(fm1)
An object of class "coef.lmer"
[[1]]
   (Intercept)      URBAN
1     7.359555 -0.8052278
2     8.519721 -0.8052278
3    11.530509 -0.8052278
4    13.656217 -0.8052278
5    16.425619 -0.8052278
6    19.195021 -0.8052278
7    19.631031 -0.8052278
8    22.078586 -0.8052278
9    23.721524 -0.8052278
10   24.479381 -0.8052278
11   26.524627 -0.8052278
12   28.087102 -0.8052278
13   29.810501 -0.8052278
14   31.936209 -0.8052278
15   35.641243 -0.8052278
16   38.249722 -0.8052278
Comparing Models
> anova(fm.null, fm1)
Data: example
Models:
fm.null: SCIENCE ~ 1 + (1 | GROUP)
fm1: SCIENCE ~ URBAN + (1 | GROUP)
        Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm.null  2 644.17 650.32 -320.08
fm1      3 505.37 514.60 -249.69 140.79      1  < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Graphing
with(fm1, {
  cc <- coef(.)$GROUP
  xyplot(SCIENCE ~ URBAN | GROUP,
    index.cond = function(x, y) coef(lm(y ~ x))[1],
    panel = function(x, y, groups, subscripts, ...) {
      panel.grid(h = -1, v = -1)
      panel.points(x, y, ...)
      subj <- as.character(GROUP[subscripts][1])
      panel.abline(cc[subj, 1], cc[subj, 2])
    })
})
Adding a Random Coefficient
> fm2 <- lmer(SCIENCE~URBAN + (URBAN|GROUP), example)
> summary(fm2)
Linear mixed-effects model fit by REML
Formula: SCIENCE ~ URBAN + (URBAN | GROUP)
   Data: example
   AIC   BIC logLik MLdeviance REMLdeviance
 422.2 437.5 -206.1      413.2        412.2
Random effects:
 Groups   Name        Variance  Std.Dev. Corr
 GROUP    (Intercept) 113.65372 10.66085
          URBAN         0.25204  0.50204 -0.626
 Residual               0.27066  0.52025
number of obs: 160, groups: GROUP, 16
Fixed effects:
            Estimate Std. Error t value
(Intercept)  22.3913     2.7176   8.239
URBAN        -0.8670     0.1298  -6.679
Correlation of Fixed Effects:
      (Intr)
URBAN -0.641
coef(fm2)
An object of class "coef.lmer"
[[1]]
   (Intercept)      URBAN
1     7.038437 -0.7468619
2     8.901233 -0.8742708
3    11.668557 -0.8227891
4    15.130206 -0.9607240
5    16.185638 -0.7849262
6    26.029030 -1.2969941
7    24.550979 -1.1780460
8    24.894060 -0.9929589
9    31.570587 -1.3020201
10   26.967287 -0.9657138
11   30.982799 -1.0705419
12   36.360597 -1.2779387
13   31.267775 -0.8843021
14   41.202393 -1.2731132
15   12.723743  0.2710856
16   12.787665  0.2879866
Graphing
with(fm2, {
  cc <- coef(.)$GROUP
  xyplot(SCIENCE ~ URBAN | GROUP,
    index.cond = function(x, y) coef(lm(y ~ x))[1],
    panel = function(x, y, groups, subscripts, ...) {
      panel.grid(h = -1, v = -1)
      panel.points(x, y, ...)
      subj <- as.character(GROUP[subscripts][1])
      panel.abline(cc[subj, 1], cc[subj, 2])
    })
})
Comparing Models Again
> anova(fm.null, fm1, fm2)
Data: example
Models:
fm.null: SCIENCE ~ 1 + (1 | GROUP)
fm1: SCIENCE ~ URBAN + (1 | GROUP)
fm2: SCIENCE ~ URBAN + (URBAN | GROUP)
        Df    AIC    BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm.null  2 644.17 650.32 -320.08
fm1      3 505.37 514.60 -249.69 140.79      1  < 2.2e-16 ***
fm2      5 423.22 438.60 -206.61  86.15      2  < 2.2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Comparing Three Models
Which Model is Best?
OLS vs. EB Random Estimates
Notice that estimates further away from our grand slope estimate (-0.87) are shrunk further back toward the mean.
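One way to see the shrinkage is to fit a separate OLS regression in each group with lmList and set those slopes beside the EB estimates from coef(fm2); a sketch, assuming fm2 and the example data from earlier:

```r
library(lme4)
# Per-group OLS slopes (no pooling) vs. EB slopes (partial pooling)
ols <- coef(lmList(SCIENCE ~ URBAN | GROUP, example))
eb  <- coef(fm2)$GROUP
cbind(OLS = ols[, "URBAN"], EB = eb[, "URBAN"])
# Groups whose OLS slope sits far from -0.87 show the largest
# pull back toward the grand slope
```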
R² in HLM
  Level-1 equation:  R²_1 = (σ²_e|b - σ²_e|m) / σ²_e|b
  Level-2 equation:  R²_2 = (σ²_u0|b - σ²_u0|m) / σ²_u0|b
(b = base/null model, m = model with predictors)
  R²_1 = (1.979 - 0.271) / 1.979 = .863
  R²_2 = (25.531 - 86.456) / 25.531 = -2.386
??? -238% Variance Explained???
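Plugging the variance components from the fitted models into these formulas reproduces the numbers above; a sketch:

```r
# Base (null) model variances: sigma2_e = 1.979, sigma2_u0 = 25.531
# Conditional-model variances from the summaries shown earlier:
# sigma2_e|m = 0.271, sigma2_u0|m = 86.456
r2_level1 <- (1.979 - 0.271) / 1.979     # 0.863
r2_level2 <- (25.531 - 86.456) / 25.531  # -2.386
c(r2_level1, r2_level2)
# The level-2 "variance explained" can go negative: adding a
# level-1 predictor can inflate the estimated intercept variance
```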
Distributional Assumptions
Level-1 errors are independent and identically normally distributed with a mean of 0 and a variance of σ² in the population
> qqmath(~resid(fm2))
Distributional Assumptions
Level-1 predictor is independent of Level-1 errors
> with(fm2, xyplot(resid(.) ~ fitted(.)))
An Example for Homework
http://www.hlm-online.com/datasets/education/
Look at Dataset 2 (download dataset from online)
Printout is in your packet
Other Software Packages for HLM Analysis
Good reviews at http://www.cmm.bristol.ac.uk/
MLwiN
SAS: PROC MIXED, PROC NLMIXED, GLIMMIX
S-PLUS: lme, nlme, glme
Mplus
SPSS (version 12 and later)
STATA
