In univariate time series models we attempt to predict a variable using only the information contained in its past values (i.e. we let the data speak for themselves).
Non-stationarity in autocorrelations as well as in variance

A driftless random walk: $X_t = X_{t-1} + u_t$, $u_t \sim N(0, 9)$

[Figure: simulated driftless random walk, 1,000 observations; the series wanders between roughly -40 and +50 with no tendency to revert to a fixed mean]
Non-stationarity in autocorrelations as well as in mean and variance

A random walk with drift: $X_t = 0.2 + X_{t-1} + u_t$, $u_t \sim N(0, 9)$

[Figure: simulated random walk with drift, 1,000 observations; the series trends upward from 0 to roughly 240]
Source: Mukherjee et al. (1998), Econometrics and Data Analysis for Developing Countries.
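The two processes are straightforward to simulate (a minimal Python/NumPy sketch; the seed and plotting details are arbitrary choices):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
T = 1000
u = rng.normal(0, 3, size=T)  # N(0, 9) innovations (standard deviation 3)

# Driftless random walk: X_t = X_{t-1} + u_t
driftless = np.cumsum(u)

# Random walk with drift: X_t = 0.2 + X_{t-1} + u_t
with_drift = np.cumsum(0.2 + u)

fig, axes = plt.subplots(1, 2, figsize=(10, 3))
axes[0].plot(driftless); axes[0].set_title("Driftless random walk")
axes[1].plot(with_drift); axes[1].set_title("Random walk with drift")
plt.show()
```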
So if the process is covariance stationary, all the variances are the same and all the covariances depend only on the difference between $t_1$ and $t_2$. The moments

$$E[(y_t - E(y_t))(y_{t+s} - E(y_{t+s}))] = \gamma_s, \quad s = 0, 1, 2, \ldots$$

are known as the covariance function. The covariances $\gamma_s$ are known as autocovariances.

However, the value of the autocovariances depends on the units of measurement of $y_t$. It is thus more convenient to use the autocorrelations, which are the autocovariances normalised by dividing by the variance:

$$\tau_s = \frac{\gamma_s}{\gamma_0}, \quad s = 0, 1, 2, \ldots$$
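As a concrete illustration, the sample autocovariances and autocorrelations can be computed directly (a minimal sketch; `y` is any observed series supplied by the caller):

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations tau_s = gamma_s / gamma_0 for s = 1..max_lag."""
    y = np.asarray(y, dtype=float)
    ybar = y.mean()
    gamma0 = np.mean((y - ybar) ** 2)  # autocovariance at lag 0 (the variance)
    acf = []
    for s in range(1, max_lag + 1):
        # conventional estimator: divide the lag-s cross-product sum by T
        gamma_s = np.sum((y[:-s] - ybar) * (y[s:] - ybar)) / len(y)
        acf.append(gamma_s / gamma0)
    return np.array(acf)
```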
We can also test the joint hypothesis that all $m$ of the $\tau_k$ correlation coefficients are simultaneously equal to zero using the Q-statistic developed by Box and Pierce:

$$Q = T \sum_{k=1}^{m} \hat{\tau}_k^2$$

A variant with better small-sample properties is the Ljung-Box statistic:

$$Q^* = T(T+2) \sum_{k=1}^{m} \frac{\hat{\tau}_k^2}{T-k} \sim \chi^2_m$$
An ACF Example

Question:
Suppose that a researcher had estimated the first 5 autocorrelation coefficients using a series of length 100 observations, and found them to be (from lags 1 to 5): 0.207, -0.013, 0.086, 0.005, -0.022. Test each of the individual coefficients for significance, and use both the Box-Pierce and Ljung-Box tests to establish whether they are jointly significant.

Solution:
A coefficient would be significant if it lies outside $(-0.196, +0.196)$ at the 5% level (i.e. $\pm 1.96/\sqrt{100}$), so only the first autocorrelation coefficient is significant.

Q = 5.09 and Q* = 5.26, compared with a tabulated $\chi^2(5) = 11.1$ at the 5% level, so the 5 coefficients are jointly insignificant [p-value = P($\chi^2_5 > 5.09$) $\approx$ 0.40].
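These figures can be reproduced directly (a minimal sketch in Python; scipy is used only for the chi-squared p-value):

```python
import numpy as np
from scipy import stats

T = 100
tau = np.array([0.207, -0.013, 0.086, 0.005, -0.022])
k = np.arange(1, len(tau) + 1)

Q = T * np.sum(tau**2)                           # Box-Pierce
Q_star = T * (T + 2) * np.sum(tau**2 / (T - k))  # Ljung-Box
p_value = stats.chi2.sf(Q, df=len(tau))          # upper-tail probability

print(Q, Q_star, p_value)  # approx. 5.09, 5.26, 0.40
```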
Moving Average Processes

Some economic hypotheses lead to a moving average time series structure. For example, the change in the price of a stock from one day to the next may behave as a series of uncorrelated random variables with zero mean and constant variance, i.e.

$$y_t = P_t - P_{t-1} = u_t, \quad t = 1, 2, \ldots, T$$

where $u_t$ is an uncorrelated random variable.

The random component $u_t$ reflects unexpected news, e.g. new information about the financial health of a corporation, a sudden rise or fall in a product's popularity (due to reports of desirable or undesirable effects), the emergence of new competitors, the revelation of a management scandal, etc.

But suppose that the full impact of any unexpected news is not completely absorbed by the market in one day. Then the price change on the next day might be

$$y_{t+1} = u_{t+1} + \theta_1 u_t$$

where $u_{t+1}$ is the effect of new information received during day $t+1$ and $\theta_1 u_t$ reflects the continuing assessment of day $t$ news.

The equation above is a moving average process: the value of the economic variable $y_{t+1}$ is a weighted combination of current and past period random disturbances.
More generally, for an MA(q) process the autocovariances are

$$\gamma_s = \begin{cases} (\theta_s + \theta_{s+1}\theta_1 + \theta_{s+2}\theta_2 + \cdots + \theta_q \theta_{q-s})\,\sigma^2 & \text{for } s = 1, 2, \ldots, q \\ 0 & \text{for } s > q \end{cases}$$
Example of an MA Problem

Consider the following MA(2) process: $X_t = u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2}$, where $u_t$ is a zero-mean white noise process with variance $\sigma^2$.
(i) Calculate the mean and variance of $X_t$.
(ii) Derive the autocorrelation function of this process.
(iii) If $\theta_1 = -0.5$ and $\theta_2 = 0.25$, sketch the ACF of $X_t$.
Solution

(i) If $E(u_t) = 0$, then $E(u_{t-i}) = 0 \;\forall\; i$. So

$$E(X_t) = E(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2}) = E(u_t) + \theta_1 E(u_{t-1}) + \theta_2 E(u_{t-2}) = 0 \quad \text{(why?)}$$

$$\text{Var}(X_t) = E[X_t - E(X_t)][X_t - E(X_t)]$$

but $E(X_t) = 0$, so

$$\text{Var}(X_t) = E[(X_t)(X_t)] = E[(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2})(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2})]$$
$$= E[u_t^2 + \theta_1^2 u_{t-1}^2 + \theta_2^2 u_{t-2}^2 + \text{cross-products}] \quad \text{(why?)}$$
Solution (contd)

So

$$\text{Var}(X_t) = \gamma_0 = E[u_t^2 + \theta_1^2 u_{t-1}^2 + \theta_2^2 u_{t-2}^2] = \sigma^2 + \theta_1^2 \sigma^2 + \theta_2^2 \sigma^2 \quad \text{(why?)}$$
$$= (1 + \theta_1^2 + \theta_2^2)\,\sigma^2$$
Solution (contd)

The lag-1 and lag-2 autocovariances are obtained in the same way (only the cross-products with matching dates survive): $\gamma_1 = (\theta_1 + \theta_1\theta_2)\,\sigma^2$ and $\gamma_2 = \theta_2\,\sigma^2$. For the lag-3 autocovariance:

$$\gamma_3 = E[X_t - E(X_t)][X_{t-3} - E(X_{t-3})] = E[X_t X_{t-3}]$$
$$= E[(u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2})(u_{t-3} + \theta_1 u_{t-4} + \theta_2 u_{t-5})] = 0$$

since the two brackets share no common $u$ terms. So $\gamma_s = 0$ for $s > 2$.
Solution (contd)

We have the autocovariances; now calculate the autocorrelations:

$$\tau_0 = \frac{\gamma_0}{\gamma_0} = 1$$

$$\tau_1 = \frac{\gamma_1}{\gamma_0} = \frac{(\theta_1 + \theta_1\theta_2)\,\sigma^2}{(1 + \theta_1^2 + \theta_2^2)\,\sigma^2} = \frac{\theta_1 + \theta_1\theta_2}{1 + \theta_1^2 + \theta_2^2}$$

$$\tau_2 = \frac{\gamma_2}{\gamma_0} = \frac{\theta_2\,\sigma^2}{(1 + \theta_1^2 + \theta_2^2)\,\sigma^2} = \frac{\theta_2}{1 + \theta_1^2 + \theta_2^2}$$

$$\tau_3 = \frac{\gamma_3}{\gamma_0} = 0, \qquad \tau_s = \frac{\gamma_s}{\gamma_0} = 0 \;\forall\; s > 2$$

(iii) For $\theta_1 = -0.5$ and $\theta_2 = 0.25$, substituting these into the formulae above gives $\tau_1 = -0.476$, $\tau_2 = 0.190$.
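These values can be checked numerically (a minimal sketch; the innovation standard deviation of 3 is an arbitrary choice, since the ACF does not depend on it):

```python
import numpy as np

theta1, theta2 = -0.5, 0.25

# Theoretical ACF of X_t = u_t + theta1*u_{t-1} + theta2*u_{t-2}
gamma0 = 1 + theta1**2 + theta2**2
print((theta1 + theta1 * theta2) / gamma0)  # tau_1 = -0.476
print(theta2 / gamma0)                      # tau_2 =  0.190

# Compare with the sample ACF of a long simulated MA(2) series
rng = np.random.default_rng(0)
u = rng.normal(0, 3, size=200_000)
x = u[2:] + theta1 * u[1:-1] + theta2 * u[:-2]
for s in (1, 2, 3):
    print(s, np.corrcoef(x[s:], x[:-s])[0, 1])  # approx -0.476, 0.190, 0
```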
ACF Plot

Thus the ACF plot will appear as follows:

[Figure: ACF of the MA(2) process, equal to 1 at lag 0, -0.476 at lag 1, 0.190 at lag 2, and zero thereafter; y-axis from -0.6 to 1.2]
Autoregressive Processes

Economic activity takes time to slow down and speed up; there is built-in inertia in economic series. A simple process that characterises this behaviour is the first-order autoregressive process:

$$y_t = \mu + \phi_1 y_{t-1} + u_t$$
The general AR(p) model is

$$y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + u_t$$

$$\text{or}\quad y_t = \mu + \sum_{i=1}^{p} \phi_i y_{t-i} + u_t$$

$$\text{or}\quad y_t = \mu + \sum_{i=1}^{p} \phi_i L^i y_t + u_t$$

$$\text{or}\quad \phi(L)\,y_t = \mu + u_t$$

where $\phi(L) = 1 - (\phi_1 L + \phi_2 L^2 + \cdots + \phi_p L^p)$.
The condition for stationarity of a general AR(p) model is that the roots of the lag polynomial

$$1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p = 0$$

all lie outside the unit circle, i.e. have absolute value greater than one.

Example:

$$1 - 1.2L + 0.32L^2 = 0 \quad\Leftrightarrow\quad 0.32L^2 - 1.2L + 1 = 0$$

The characteristic roots are 2.5 and 1.25, both outside the unit circle, so the process is stationary.
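A quick check of the example (a minimal sketch using NumPy's polynomial root finder):

```python
import numpy as np

# Roots of 0.32L^2 - 1.2L + 1 = 0
# (np.roots takes coefficients in decreasing powers of L)
roots = np.roots([0.32, -1.2, 1])
print(roots)                      # [2.5, 1.25]
print(np.all(np.abs(roots) > 1))  # True -> the AR(2) process is stationary
```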
Wold's Decomposition Theorem

This theorem states that any stationary series can be decomposed into the sum of two unrelated processes: a purely deterministic part and a purely stochastic part, which will be an MA(∞). For a stationary AR(p) process, ignoring the intercept, the MA(∞) representation is

$$y_t = \psi(L)\,u_t$$

where $\psi(L) = (1 - \phi_1 L - \phi_2 L^2 - \cdots - \phi_p L^p)^{-1}$.
Sample AR Problem

Consider the AR(1) model

$$y_t = \mu + \phi_1 y_{t-1} + u_t$$

(i) Calculate the (unconditional) mean of $y_t$. For the remainder of the question, set $\mu = 0$ for simplicity.
(ii) Calculate the (unconditional) variance of $y_t$.
(iii) Derive the autocorrelation function for $y_t$.
Solution

(i) Taking expectations of both sides, $E(y_t) = \mu + \phi_1 E(y_{t-1})$. Under stationarity $E(y_t) = E(y_{t-1})$, so

$$E(y_t) = \frac{\mu}{1 - \phi_1}$$
Solution (contd)

(ii) Using Wold's decomposition,

$$y_t = (1 - \phi_1 L)^{-1} u_t = (1 + \phi_1 L + \phi_1^2 L^2 + \cdots)\,u_t = u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots$$

$$\text{Var}(y_t) = E[y_t - E(y_t)][y_t - E(y_t)]$$

but $E(y_t) = 0$, since we are setting $\mu = 0$. So $\text{Var}(y_t) = E[(y_t)(y_t)]$.
Solution (contd)

$$\text{Var}(y_t) = E[(u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots)(u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots)]$$
$$= E[u_t^2 + \phi_1^2 u_{t-1}^2 + \phi_1^4 u_{t-2}^2 + \cdots] \quad \text{(the cross-products have zero expectation)}$$
$$= \sigma_u^2 + \phi_1^2 \sigma_u^2 + \phi_1^4 \sigma_u^2 + \cdots = \sigma_u^2\,(1 + \phi_1^2 + \phi_1^4 + \cdots) = \frac{\sigma_u^2}{1 - \phi_1^2}$$
Solution (contd)

(iii) Turning now to the autocorrelation function, the autocovariances must first be calculated. For the lag-1 autocovariance:

$$\gamma_1 = E[y_t y_{t-1}] = E[(u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots)(u_{t-1} + \phi_1 u_{t-2} + \phi_1^2 u_{t-3} + \cdots)]$$
$$= \phi_1 \sigma^2 + \phi_1^3 \sigma^2 + \phi_1^5 \sigma^2 + \cdots = \frac{\phi_1 \sigma^2}{1 - \phi_1^2}$$

(make a bivariate table to keep track of the product of the two brackets)
Solution (contd)

For the second autocorrelation coefficient,

$$\gamma_2 = \text{Cov}(y_t, y_{t-2}) = E[y_t - E(y_t)][y_{t-2} - E(y_{t-2})]$$

Using the same rules as applied above for the lag-1 covariance,

$$\gamma_2 = E[y_t y_{t-2}] = E[(u_t + \phi_1 u_{t-1} + \phi_1^2 u_{t-2} + \cdots)(u_{t-2} + \phi_1 u_{t-3} + \phi_1^2 u_{t-4} + \cdots)]$$
$$= E[\phi_1^2 u_{t-2}^2 + \phi_1^4 u_{t-3}^2 + \cdots + \text{cross-products}]$$
$$= \phi_1^2 \sigma^2 + \phi_1^4 \sigma^2 + \cdots = \phi_1^2 \sigma^2\,(1 + \phi_1^2 + \phi_1^4 + \cdots) = \frac{\phi_1^2 \sigma^2}{1 - \phi_1^2}$$
Solution (contd)

Similarly,

$$\gamma_3 = \frac{\phi_1^3 \sigma^2}{1 - \phi_1^2}, \qquad \gamma_s = \frac{\phi_1^s \sigma^2}{1 - \phi_1^2}$$
Solution (contd)

The autocorrelation function is then

$$\tau_0 = \frac{\gamma_0}{\gamma_0} = 1$$

$$\tau_1 = \frac{\gamma_1}{\gamma_0} = \frac{\phi_1 \sigma^2 / (1 - \phi_1^2)}{\sigma^2 / (1 - \phi_1^2)} = \phi_1$$

$$\tau_2 = \frac{\gamma_2}{\gamma_0} = \frac{\phi_1^2 \sigma^2 / (1 - \phi_1^2)}{\sigma^2 / (1 - \phi_1^2)} = \phi_1^2$$

$$\tau_3 = \phi_1^3, \qquad \ldots, \qquad \tau_s = \phi_1^s$$
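Again, a simulation confirms the geometric pattern (a minimal sketch; $\phi_1 = 0.5$ is an arbitrary illustrative choice):

```python
import numpy as np

phi1, T = 0.5, 200_000
rng = np.random.default_rng(1)
u = rng.standard_normal(T)

# Simulate y_t = phi1 * y_{t-1} + u_t
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi1 * y[t - 1] + u[t]

# The sample ACF should be close to phi1**s at each lag s
for s in (1, 2, 3):
    print(s, np.corrcoef(y[s:], y[:-s])[0, 1], phi1**s)
```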
The Partial Autocorrelation Function

So $\tau_{kk}$ measures the correlation between $y_t$ and $y_{t-k}$ after removing the effects of $y_{t-k+1}, y_{t-k+2}, \ldots, y_{t-1}$.
The PACF is useful for telling the difference between an AR process and an ARMA process. In the case of an AR(p), there are direct connections between $y_t$ and $y_{t-s}$ only for $s \le p$. In the case of an MA(q), the process can be written as an AR(∞), so there are direct connections between $y_t$ and all its previous values.
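In practice both functions are inspected graphically (a minimal sketch using statsmodels' plotting helpers; the AR(1) series is simulated purely for illustration):

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# Illustrative series: an AR(1) with phi = 0.5 (any observed series works)
rng = np.random.default_rng(0)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal()

fig, axes = plt.subplots(2, 1, figsize=(8, 6))
plot_acf(y, lags=10, ax=axes[0])   # geometric decay expected for an AR(1)
plot_pacf(y, lags=10, ax=axes[1])  # cut-off after lag 1 expected
plt.show()
```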
ARMA Processes

By combining the AR(p) and MA(q) models, an ARMA(p, q) model is obtained:

$$y_t = \mu + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \cdots + \theta_q u_{t-q} + u_t$$

with $E(u_t) = 0$; $E(u_t^2) = \sigma^2$; $E(u_t u_s) = 0$ for $t \neq s$.
The mean of an ARMA series is given by

$$E(y_t) = \frac{\mu}{1 - \phi_1 - \phi_2 - \cdots - \phi_p}$$
[Figure: sample ACF and PACF plot, lags 1-10; both functions negative, with values between roughly -0.05 and -0.45, consistent with an MA(1) with a negative coefficient]
[Figure: sample ACF and PACF plot, lags 1-10; values between roughly 0.3 and -0.4, consistent with an MA(2) process]
[Figure: sample ACF and PACF plot, lags 1-10; ACF decaying slowly from about 0.8, consistent with a slowly decaying AR(1) with a large positive coefficient]
[Figure: sample ACF and PACF plot, lags 1-10; ACF decaying from about 0.5, consistent with a more rapidly decaying AR(1)]
[Figure: sample ACF and PACF plot, lags 1-10; negative oscillating values down to about -0.6, consistent with an AR(1) with a negative coefficient]
[Figure: sample ACF and PACF plot, lags 1-10; ACF remaining high (around 0.8) at all lags, consistent with a non-stationary series]
[Figure: sample ACF and PACF plot, lags 1-10; values between roughly 0.6 and -0.4, consistent with an ARMA process]
Box and Jenkins (1970) were the first to approach the task of estimating an ARMA model in a systematic manner. There are 3 steps to their approach:
1. Identification
2. Estimation
3. Model diagnostic checking

Step 1:
- involves determining the order of the model
- traditionally done using graphical procedures (plotting the ACF and PACF)
- a better procedure, based on information criteria, is now available
Reasons for preferring a parsimonious model:
- The variance of the estimators is inversely proportional to the number of degrees of freedom.
- Models which are profligate might be inclined to fit to data-specific features.

This gives motivation for using information criteria, which embody two factors:
- a term which is a function of the RSS
- some penalty for adding extra parameters
The information criteria vary according to how stiff the penalty term is. The three most popular criteria are Akaike's (1974) information criterion (AIC), Schwarz's (1978) Bayesian information criterion (SBIC), and the Hannan-Quinn criterion (HQIC):

$$AIC = \ln(\hat{\sigma}^2) + \frac{2k}{T}$$

$$SBIC = \ln(\hat{\sigma}^2) + \frac{k}{T}\ln T$$

$$HQIC = \ln(\hat{\sigma}^2) + \frac{2k}{T}\ln(\ln(T))$$

where $k = p + q + 1$ and $T$ = sample size. So we minimise the chosen IC subject to $p \le \bar{p}$, $q \le \bar{q}$.

SBIC embodies a stiffer penalty term than AIC. Which IC should be preferred if they suggest different model orders? SBIC is strongly consistent (but inefficient); AIC is not consistent, and will typically pick bigger models.
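One way to make this operational is a grid search over candidate orders (a minimal sketch using statsmodels; the simulated ARMA(1,1) series and the bounds $\bar{p} = \bar{q} = 4$ are illustrative assumptions):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# y: the series under study (here a simulated ARMA(1,1) for illustration)
rng = np.random.default_rng(0)
u = rng.standard_normal(500)
y = np.zeros(500)
for t in range(1, 500):
    y[t] = 0.5 * y[t - 1] + u[t] + 0.3 * u[t - 1]

best = {"aic": (np.inf, None), "bic": (np.inf, None)}
for p in range(5):          # p <= pbar = 4
    for q in range(5):      # q <= qbar = 4
        res = ARIMA(y, order=(p, 0, q)).fit()
        for name in best:
            val = getattr(res, name)  # res.aic / res.bic
            if val < best[name][0]:
                best[name] = (val, (p, q))

print(best)  # the (p, q) order minimising each criterion
```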
ARIMA Models

An ARMA(p, q) model fitted to a series that has been differenced d times to induce stationarity is an ARIMA(p, d, q) model.

Forecasting in Econometrics

Forecasting means prediction, and it provides an important test of the adequacy of a model, e.g.:
- forecasting tomorrow's return on a particular share
- forecasting the price of a house given its characteristics
- forecasting the riskiness of a portfolio over the next year
- forecasting the volatility of bond returns

We can distinguish time-series forecasts from structural (econometric) forecasts, although the distinction between the two types is somewhat blurred (e.g., VARs).
Say we have some data, e.g. monthly KSE-100 index returns for 120 months, 1990M1-1999M12. We could use all of it to build the model, or keep some observations back (e.g. estimate over 1990M1-1998M12 and reserve 1999 for forecast evaluation). The held-back data provide a good test of the model, since we have not used the information from 1999M1 onwards when we estimated the model parameters.
Models for forecasting include:
- simple unweighted averages
- exponentially weighted averages
- ARIMA models
- non-linear models, e.g. threshold models, GARCH, etc.
Forecasting with ARMA models: the forecast of $y_{t+s}$ made at time $t$ is generated recursively as

$$f_{t,s} = \mu + \sum_{i=1}^{p} \phi_i f_{t,s-i} + \sum_{j=1}^{q} \theta_j u_{t+s-j}$$

where $f_{t,s} = y_{t+s}$ for $s \le 0$; $u_{t+s} = 0$ for $s > 0$ and $u_{t+s} = u_{t+s}$ for $s \le 0$.
An MA(q) model only has a memory of length q. For example, suppose we have estimated an MA(3) model, $y_t = \mu + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \theta_3 u_{t-3} + u_t$. Denoting by $\Omega_t$ the information set at time $t$:

$$f_{t,1} = E(y_{t+1} \mid \Omega_t) = \mu + \theta_1 u_t + \theta_2 u_{t-1} + \theta_3 u_{t-2}$$
$$f_{t,2} = E(y_{t+2} \mid \Omega_t) = \mu + \theta_2 u_t + \theta_3 u_{t-1}$$
$$f_{t,3} = E(y_{t+3} \mid \Omega_t) = \mu + \theta_3 u_t$$
$$f_{t,4} = E(y_{t+4} \mid \Omega_t) = \mu$$
$$f_{t,s} = E(y_{t+s} \mid \Omega_t) = \mu \quad \forall\, s \ge 4$$
Forecast accuracy can be assessed using the mean squared error,

$$MSE = \frac{1}{N} \sum_{t=1}^{N} (y_{t+s} - f_{t,s})^2$$

the mean absolute error,

$$MAE = \frac{1}{N} \sum_{t=1}^{N} \left| y_{t+s} - f_{t,s} \right|$$

and the mean absolute percentage error,

$$MAPE = \frac{100}{N} \sum_{t=1}^{N} \left| \frac{y_{t+s} - f_{t,s}}{y_{t+s}} \right|$$
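All three measures take only a few lines (a minimal sketch; `actual` and `forecast` are placeholder names for the hold-out outturns and the corresponding forecasts):

```python
import numpy as np

def forecast_errors(actual, forecast):
    """MSE, MAE and MAPE for paired arrays of outturns and forecasts."""
    actual, forecast = np.asarray(actual), np.asarray(forecast)
    e = actual - forecast
    mse = np.mean(e**2)
    mae = np.mean(np.abs(e))
    mape = 100 * np.mean(np.abs(e / actual))
    return mse, mae, mape
```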
Illustration of Box-Jenkins methodology I (Pakistan GDP forecasting)
Year   GDP       Year   GDP       Year   GDP
1961   82085     1975   180404    1989   403948
1962   86693     1976   186479    1990   422484
1963   92737     1977   191717    1991   446005
1964   98902     1978   206746    1992   480413
1965   108259    1979   218258    1993   487782
1966   115517    1980   233345    1994   509091
1967   119831    1981   247831    1995   534861
1968   128097    1982   266572    1996   570157
1969   135972    1983   284667    1997   579865
1970   148343    1984   295977    1998   600125
1971   149900    1985   321751    1999   625223
1972   153018    1986   342224
1973   163262    1987   362110
1974   174712    1988   385416
[Figure: Pakistan GDP in levels, 1961-1999; the series trends upward from below 100,000 to over 600,000]
[Figure: d(GDP) (left panel) and d(log(GDP)) (right panel), 1965-1995. Var(d(GDP)) = 71,614,669]
[Figure: second differences d(GDP,2) (left panel) and d(log(GDP),2) (right panel), 1965-1995. Var(d(GDP),2) = 72,503,121; Var(d(log(GDP)),2) = 0.00074]
ARIMA (p,d,q)    AIC      BIC
ARIMA (1,1,0)   -4.879   -4.792
ARIMA (4,1,0)   -4.932   -4.708
ARIMA (0,1,1)   -4.910   -4.824
ARIMA (0,1,4)   -5.370   -5.284
ARIMA (4,1,4)   -5.309   -5.174
ARIMA (5,1,5)   -5.249   -5.113
ARIMA (1,1,4)   -5.333   -5.202

ARIMA(0,1,4) is identified as the best model by both model selection criteria: the smaller the values of the selection criteria, the better the in-sample fit.
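For readers working in Python rather than EViews, a sketch of the corresponding estimation (assuming the GDP data above are held in a pandas Series named `gdp`; the estimates need not match the EViews output exactly):

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# gdp: annual GDP series, 1961-1999 (see the table above)
y = np.log(gdp)

res = ARIMA(y, order=(0, 1, 4)).fit()  # ARIMA(0,1,4) on log(GDP)
print(res.summary())
print(res.aic, res.bic)
```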
The estimated ARIMA(0,1,4) model is

$$(1 - L)\,y_t = (1 + \theta_1 L + \theta_2 L^2 + \theta_3 L^3 + \theta_4 L^4)\,u_t$$
$$(1 - L)\,y_t = (1 - 0.104L + 0.165L^2 - 0.201L^3 + 0.913L^4)\,u_t$$
Model Diagnostics

We look at the correlogram of the residuals from the estimated model. The residuals appear to be white noise, judging by the p-values of the Q-statistics for the ARIMA(0,1,4) residuals.
[Figure: GDP and the forecast series GDPF6, 1970-1995]
Forecast accuracy over the hold-out sample is judged by the root mean squared error,

$$RMSE = \sqrt{\frac{1}{h}\sum (Y - \hat{Y})^2}$$

[Figure: GDP and the forecast series GDPF7, 1965-1995]
Using the two competing models, the forecasts are generated as follows:

Year    Observed    ARIMA(0,1,4)    ARIMA(1,1,4)
1995    534861.0    536938.6        539376.9
1996    570157.0    569718.6        570955.6
1997    579865.0    584971.0        587198.2
1998    600125.0    615367.1        61828.2
1999    625223.0    648580.1        652246.0
RMSE                12715.77        15064.7

Note: the static forecast option for dynamic models (e.g. ARIMA) in EViews uses actual values of the lagged dependent variable, while the dynamic forecast option uses previously forecast values of the lagged dependent variable.

ARIMA(0,1,4) generates better forecasts, as seen from its smaller RMSE.
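The reported RMSE for ARIMA(0,1,4) can be verified from the table (a minimal sketch):

```python
import numpy as np

observed = np.array([534861.0, 570157.0, 579865.0, 600125.0, 625223.0])
f_014 = np.array([536938.6, 569718.6, 584971.0, 615367.1, 648580.1])

rmse = np.sqrt(np.mean((observed - f_014) ** 2))
print(rmse)  # approx 12716, matching the table
```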
Illustration of Box-Jenkins methodology II (airline passengers data)

[Figure: PASSENGERS in levels (left panel) and LOG(PASSENGERS) (right panel), 1949-1961; the level series trends upward from about 100 to 600 with growing seasonal swings, while the log transformation stabilises the variance]
[Figure: D(LOG(PASSENGERS)) and the seasonally differenced D(LOG(PASSENGERS),1,12), 1949-1961 (two panels)]
Let's have a look at the ACF and PACF. The ACF and PACF of d(log(Yt),1,12) indicate some significant values at lags 1 and 12. We will do further work on d(log(Yt),1,12).
Models                  AIC      BIC
MA(1) SMA(12)          -3.754   -3.689
AR(1) SMA(12)          -3.744   -3.678
AR(1) SAR(12)          -3.655   -3.585
MA(1) SAR(12)          -3.677   -3.609
AR(1) MA(1) SAR(12)    -3.656   -3.562
AR(1) MA(1) SMA(12)    -3.779   -3.691
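A sketch of how a model of this seasonal form could be estimated in Python (the original analysis used EViews; `passengers` is a placeholder name for the monthly series):

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# MA(1) x SMA(12) on d(log(passengers), 1, 12):
# SARIMAX applies the differencing internally via d=1, D=1, s=12.
model = SARIMAX(np.log(passengers),
                order=(0, 1, 1),               # MA(1) with one regular difference
                seasonal_order=(0, 1, 1, 12))  # SMA(12) with one seasonal difference
res = model.fit(disp=False)
print(res.aic, res.bic)
```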
Forecasts for 1961:

Month      Forecast
1961.01    442.30
1961.02    429.45
1961.03    490.40
1961.04    484.82
1961.05    490.93
1961.06    560.17
1961.07    629.91
1961.08    626.91
1961.09    539.16
1961.10    474.11
1961.11    412.15
1961.12    462.14

[Figure: PASSENGERS and its forecast (PASSENGERSF), 1949-1961]