
Cross-correlations and Predictability of Stock Returns
D. OLSON¹ AND C. MOSSMAN²
¹ American University of Sharjah, United Arab Emirates
² University of Manitoba, Canada
ABSTRACT
Studies have shown that small stock returns can be partially predicted by the
past returns of large stocks (cross-correlations), while a larger body of
literature has shown that macroeconomic variables can predict future stock
returns. This paper assesses the marginal contribution of cross-correlations
after controlling for predictability inherent in lagged macroeconomic
variables. Macroeconomic forecasting models generate trading rule profits
of up to 0.431% per month, while the inclusion of cross-correlations
increases returns to 0.516% per month. Such results suggest that
cross-correlations may serve as a proxy for omitted macroeconomic variables
in studies of stock market predictability. Macroeconomic variables are more
important than cross-correlations in forecasting small stock returns, and
encompassing tests suggest that the small marginal contribution of
cross-correlations is not statistically significant. Copyright © 2001 John
Wiley & Sons, Ltd.
INTRODUCTION
Recent studies, such as Lo and MacKinlay (1990), have shown that small stock returns can be
predicted, in part, by the past returns of larger stocks. The cross-correlations are asymmetric in
the sense that returns to small stocks are correlated with lagged returns on large stocks, but
lagged returns for small stocks do not help predict returns to large stocks. The existence of this
lead-lag relationship between large and small stocks raises questions about market efficiency. To
date, two studies have examined whether trading rules can exploit the predictability inherent in
cross-correlations. McQueen, Pinegar, and Thorley (1996) devise a trading rule that yields
annualized abnormal returns of 68%, while Knez and Ready's (1996) non-parametric
forecasting technique generates trading rule profits of up to 21% per year. However, Knez and
Ready (1996) argue that the inclusion of realistic transaction costs effectively eliminates trading
rule profits.
In addition to the predictability arising from past stock returns, macroeconomic variables have
been shown to predict the time series of stock returns, while stock market fundamentals help
explain the cross-section of stock returns. Connor (1995) categorizes models designed to capture
these sources of predictability as statistical factor models, macroeconomic factor models, and
fundamental factor models. For a pooled cross-sectional time series of US stock returns for
1985-1993, he finds that macroeconomic variables contain no marginal explanatory power when
added to either fundamental or statistical factor models. In contrast, Lo and MacKinlay (1990)
hypothesize that macroeconomic information impacts large companies first and is transmitted
with a lag to smaller companies. If this hypothesis is correct, then with the 'right' set of
macroeconomic variables as predictors, the proper lag structure, and functional form, any
economically significant prediction from cross-correlations should be eliminated. Following this
argument, one would expect macroeconomic variables to forecast small stock returns better than
statistical models involving cross-correlations, which is the opposite of Connor's (1995) findings.
Journal of Forecasting
J. Forecast. 20, 145-160 (2001)
* Correspondence to: Dennis Olson, School of Business, PO Box 26666, American University of Sharjah, Sharjah,
United Arab Emirates.
This study examines the relative importance of cross-correlations versus macroeconomic
variables in models that forecast returns for portfolios of US small stocks. Unlike previous
studies that examine predictability within-sample, comparisons between these two sources of
predictability are made using out-of-sample tests.¹ Following an approach developed by
Pesaran and Timmermann (1995), various models are fitted within-sample and tested for
one-month-ahead out-of-sample predictability. The models are updated monthly using a rolling
120-month estimation window. Small stocks are purchased and held as long as one-month-ahead
portfolio excess returns are predicted to be positive, while the risk-free asset is held whenever
the forecast for excess stock returns (returns above the risk-free rate) is negative. Base-case
forecasting models are developed for both macroeconomic variables and cross-correlations.
Then, lagged large stock returns and macroeconomic variables are included in the same model
to determine the marginal contribution of each source of predictability. The models are judged
on the basis of directional forecast accuracy and trading rule profits before and after the
inclusion of trading costs.
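The switching rule just described can be illustrated with a short sketch. This is not the authors' code: the function name `switching_returns`, its arguments, and the toy numbers are hypothetical, and charging a fixed percentage cost whenever the position changes is only one simple way to represent trading costs.

```python
def switching_returns(forecasts, stock_returns, riskfree_returns, cost=0.0):
    """Monthly returns (in percent) from the switching rule.

    forecasts[t] is the one-month-ahead forecast of the excess return for
    month t. Stocks are held when the forecast is positive; otherwise the
    risk-free asset is held. `cost` is a hypothetical per-trade charge in
    percent, applied whenever the position changes.
    """
    realized = []
    in_stocks = False  # assumption: the investor starts in the risk-free asset
    for f, r_s, r_f in zip(forecasts, stock_returns, riskfree_returns):
        want_stocks = f > 0
        r = r_s if want_stocks else r_f
        if want_stocks != in_stocks:
            r -= cost  # a trade occurs only when the position switches
        in_stocks = want_stocks
        realized.append(r)
    return realized


# Toy data (illustrative only, not taken from the paper).
forecasts = [0.4, -0.2, -0.1, 0.3]
stocks = [1.5, -2.0, 0.8, 2.2]
tbills = [0.4, 0.4, 0.4, 0.4]
print(switching_returns(forecasts, stocks, tbills))
```

Because costs are charged only when the position changes, a rule that switches rarely loses less to trading costs than one that switches often, even if both have the same directional accuracy.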
LITERATURE REVIEW
Cross-correlations
Cross-correlations are perhaps the least researched of the many sources of predictability in stock
returns that are now well documented in the finance literature. Badrinath, Kale, and Noe (1995)
suggest that cross-autocorrelations between large and small stocks arise primarily from levels of
institutional ownership, rather than stock market value. Institutionally favoured stocks tend to
be larger than institutionally unfavoured firms, so that the lead-lag effect in size portfolios may
be caused more by the level of institutional ownership than firm size. In contrast, McQueen et al.
(1996) document that observed lead-lag relationships between large and small stocks are more
size related than the result of institutional ownership. They also discovered a directional
asymmetry in cross-correlations. Small stocks respond quickly to bad macroeconomic news, but
respond with a delay to common good news. Hence, the observed lead-lag relationship applies
only to positive returns to large stocks.
In an attempt to understand why small stock returns can be predicted using cross-correlations,
Boudoukh, Richardson, and Whitelaw (1994) categorize possible explanations into three
groups (loyalists, revisionists, and heretics) according to their relationship with the efficient
market hypothesis. The loyalist group looks to data mismeasurement or specific institutional
features such as differential bid-ask spreads to defend market efficiency. Non-synchronous
trading is consistent with this explanation, but Lo and MacKinlay (1990) argue that the
frequency of non-trading is not sufficient to be the primary source of observed stock
cross-correlations.
¹ For example, Ferson and Korajczyk (1995) use factor models to determine which variables are most responsible for
within-sample returns predictability. Similarly, Connor's (1995) analysis of three types of factor models involves
in-sample comparisons.
Revisionist arguments expressed by Conrad, Gultekin, and Kaul (1991) suggest that
predictability arises from time-varying expected returns and does not violate market efficiency.
More recently, Hameed (1997) has shown that predictability from cross-correlations is likely
attributable to differences in the level of time variation in expected returns. However, McQueen
et al. (1996) note that such an explanation does not indicate why returns to large stocks cannot be
predicted in the same way. Also, their formal test for this theory fails to support the time-varying
risk premium argument.
The heretic explanations for predictability rest on over-reaction, under-reaction, noise trader
response, or feedback strategies that lead to a form of market inefficiency. Over-reaction could
lead to contrarian profits, and under-reaction to profitability of momentum strategies. For
example, Grinblatt, Titman, and Wermers (1995) argue that mutual fund managers follow each
other in buying winners, but make independent decisions about selling losers. Since less
information is available for small stocks, herding occurs once managers have observed a rather
imprecise signal, such as a positive return on large stocks in the previous period. This behaviour is
consistent with the directional asymmetry in cross-correlations, as identified by McQueen et al.
(1996).
Macroeconomic variables
Studies such as Fama and French (1989) demonstrate that macroeconomic variables
representing general business conditions can help predict the time series of stock returns.
Perhaps the most important of these variables are the levels and changes in interest rates.
Short-term rates (yields on T-bills or commercial paper), term spreads (yields on long-term
government bonds less short-term yields), and default spreads (yields on high-risk corporate
bonds versus low-risk corporate or government bonds) have been shown to have predictive
power in numerous studies. For example, Kairys (1993) shows that changes in commercial
paper rates help explain excess stock returns in the USA from the 1830s to the present. Lo and
MacKinlay (1990) show that large stocks respond to macroeconomic news in the same month
that the news is received, while the response of small stocks can take up to eight weeks (based
upon the significance of lagged autocorrelations). Jegadeesh and Titman (1995) find that stock
prices over-react to firm-specific information, but react with a delay to common factors or
macroeconomic information.
Pesaran and Timmermann (1995) show that information about industrial production,
inflation, monetary growth, dividend yields, and earnings-price ratios improves upon the
predictability discovered by interest rate variables alone. Using a methodology that serves as
the base case for this study, they update the parameters and variables in their forecasting
model each month. Before paying transaction costs, this technique provides annual returns of
3.44-3.75% above the return obtained from a buy-and-hold strategy during the years
1960-1992.
DATA AND METHODOLOGY
The data set consists of 540 monthly observations for US stock returns and macroeconomic
variables for the years 1950-1994. Stock returns data are obtained from the Center for Research
in Security Prices (CRSP). For each company, end-of-year market capitalization is used to sort
stocks into five size portfolios where company stock returns are equally weighted within quintile
groups. Macroeconomic data on US interest rates, dividend yields, earnings-price ratios, and
inflation rates are from Pinnacle Data Corporation. Data from the Federal Reserve include
various monthly interest rate series, as well as money supply, industrial production, the index of
leading indicators, and quarterly gross domestic product. The inflation rate is measured by
changes in the consumer price index. Short-term interest rates are represented by yields on 90-day
Treasury bills and monthly rates for commercial paper. The difference in yield between the
30-year government bond and the risk-free rate on 90-day Treasury bills is termed the risk premium,
while the yield differential between low- and high-grade corporate bonds is referred to as the
default premium. Finally, stock market fundamentals are reflected by the dividend and earnings
yield for stocks in the Standard and Poor's 500 index.
The data set includes all of the macroeconomic variables used by Pesaran and Timmermann
(1995) to forecast stock returns, as well as a variety of variables used in other studies. The data
examined include current and past levels, as well as annual and monthly changes in each of the
variables. Although any type of lag structure may be possible, only lags of one and two months
and annual changes in the variables had significant explanatory power. Also, higher-order
Almon lags and various complicated functional forms failed to improve upon the forecasting
abilities of simple lagged variables.
The data are initially divided into two time periods. The 'in-sample' period consists of ten years
of monthly observations (120 observations) where the first thirteen months of data are used only
for the purposes of defining annual changes in interest rates, dividend yields, and earnings-price
ratios. The 'out-of-sample' period extends 34 years from February 1961 through December of
1994, for a total of 407 observations. Several different models are developed to forecast the
direction of one-month-ahead small stock portfolio excess returns, $R_{St} = r_{St} - r_{ft}$, where
$R_{St}$ represents an excess portfolio return, $r_{St}$ is the actual return to small stocks in period $t$,
and $r_{ft}$ is the risk-free rate in period $t$. Information available in periods $t-1$ or earlier is used
to forecast current excess returns for the smallest quintile of stocks, $R_{St}$. Cross-correlations are
considered to arise from excess returns for the largest quintile of stocks in the previous period,
$R_{L,t-1}$, which then affects small stock returns in the next period, $R_{St}$. For each of the models,
the relevant stock portfolio is purchased and held as long as one-month-ahead portfolio excess
returns are predicted to be positive. The risk-free asset is held if excess returns are forecasted to
be negative. As in other studies, we also consider purchasing the large stock portfolio whenever
excess returns are forecasted to be negative, but results are not as strong as for holding the
risk-free asset.
Following Pesaran and Timmermann (1995), macroeconomic variables (excluding
cross-correlations) are fitted for the initial 10-year 'in-sample' estimation period.² The variables
are selected using stepwise regression and a cutoff significance level of 10%. This technique
approximately maximizes the within-sample adjusted R-squared statistic and it is used to
forecast one-month-ahead excess returns to the small stock portfolio. The estimating model is
then rolled forward one month by successively dropping the oldest observation and adding the
most recent monthly observation. The rolling regression window is always maintained at 120
months and the macroeconomic variables and model parameters are updated each month to
generate one-month-ahead forecasted returns for the entire 407-month out-of-sample period.³
² A 10-year period was used rather than the 6-year period used by Pesaran and Timmermann (1995). Previous work by
Brockman, Mossman, and Olson (1997) using macroeconomic variables to forecast Canadian stock index returns found
that a 10-year estimation period led to more stable and accurate one-month-ahead forecasts of stock returns than a
rolling 5- or 6-year estimation period.
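The rolling estimation scheme can be sketched as follows. This is a simplified illustration under stated assumptions: it uses a single lagged predictor in place of the stepwise-selected variable set, and the helper names `ols_fit` and `rolling_forecasts` are hypothetical.

```python
def ols_fit(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def rolling_forecasts(predictor, excess_returns, window=120):
    """One-month-ahead forecasts from a rolling OLS window.

    predictor[t] must already be lagged, i.e. known at the end of month t-1.
    The model for month t is fitted only on months t-window .. t-1, so each
    step drops the oldest observation and adds the most recent one.
    """
    forecasts = []
    for t in range(window, len(excess_returns)):
        a, b = ols_fit(predictor[t - window:t], excess_returns[t - window:t])
        forecasts.append(a + b * predictor[t])
    return forecasts


# Toy check on an exact linear relationship with a 3-month window.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 + 3.0 * v for v in x]
print(rolling_forecasts(x, y, window=3))
```

With 540 monthly observations, 13 of which are used only to define annual changes, 527 usable months remain; a 120-month window then leaves 407 one-month-ahead forecasts, which matches the out-of-sample count stated in the text.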
A series of ordinary least squares (OLS) regression models compare the forecasting ability of
macroeconomic variables versus cross-correlations. For each of the models, the forecasts are
obtained using information available at time $t-1$ to forecast period $t$. Model A, which is the
base-case macroeconomic variable model, is estimated as:
$$R_{St} = \sum_{i=1}^{N} a_i MV_i + \varepsilon_t \qquad \text{(Model A)}$$
This indicates that small stock returns in any period are a function of the $N$ macroeconomic
variables ($MV_i$) lagged one or more periods and an error term ($\varepsilon_t$). The lags $t-1$ and $t-2$
and the annual changes in the macroeconomic variables are each considered as separate variables.
The regression parameter $a_i$ takes on a non-zero value whenever $MV_i$ is included in the regression
model.
The variables selected in the 'best' within-sample specifications of Model A during any of the
407 estimation periods are listed below ($i$ = 1, 2 month lag):
$TBY_{t-i}$: yield on 3-month T-bills lagged $i$ periods ($i$ = 1, 2)
$DYLD_{t-i}$: dividend yield on the S&P 500 lagged $i$ months
$EP_{t-i}$: earnings-price ratio for the S&P 500 lagged $i$ months
$CPAP_{t-i}$: interest rate on commercial paper lagged $i$ months
$RISKP_{t-i}$: risk premium lagged $i$ months (yield on 30-year government bonds in $t-1$ minus $TBY_{t-1}$)
$DEFP_{t-i}$: default premium lagged $i$ months (yield on lower-grade BAA corporate bonds in period $t-1$ minus the yield on high-grade AAA corporate bonds in $t-1$)
$INFL_{t-i}$: inflation rate lagged $i$ months
$IP_{t-i}$: monthly change in industrial production lagged $i$ months
$JAN_t$, $SEPT_t$, $DEC_t$: dummy variable set equal to one when the month to be forecasted is January, September, or October
In addition, changes in the macroeconomic variables over the past year are significant in many
periods. For example, the variable $CTBY12 = TBY_{t-1} - TBY_{t-13}$ is the change in yield on
90-day T-bills over the past twelve months. It is significant in about one-quarter of the estimation
periods. Among all the variables considered, only the January seasonal dummy is significant in
all estimation periods. The September and October seasonal dummies are significant in about
half of the estimation periods. The remaining macroeconomic variables are not as
important individually, but some combination of them is significant in every estimation
period. For example, one or more measures of short-term interest rates
($TBY_{t-1}$, $TBY_{t-2}$, $CPAP_{t-1}$, $CPAP_{t-2}$), or annual changes in these Treasury bill or
commercial paper rates (CTBY12 or CCPAP12), are significant in every estimation period.
Similarly, some combination of lagged dividend and/or earnings yields ($DYLD_{t-i}$ or $EP_{t-i}$) is
significant in most periods, while some measure of long-term interest rates ($RISKP_{t-i}$ or
$DEFP_{t-i}$) or annual changes in these variables enters each of the estimation models. In contrast,
various lags of the inflation and industrial production variables enter only a few of the estimation
periods. Other variables not listed, such as changes in the index of leading indicators or changes
in the money supply, have predictive power if considered individually, but are subsumed by
various combinations of the listed variables.
³ This continual updating of both parameters and model specification mitigates the lookback bias, which arises if
researchers, with the benefit of hindsight, select a single set of factors that best 'predicts' historical returns over an
entire data set. As noted by Pesaran and Timmermann (1995, p. 1202), the updating methodology using rolling
regressions makes it possible to 'simulate investors' decisions in real time using publicly available information on a
set of factors thought a priori to have been relevant to forecasting stock returns'.
We can also address the importance of updating the model specification and regression
parameters monthly, as done in Pesaran and Timmermann (1995). The functional form and
included variables selected do not vary significantly from month to month. Nevertheless, it is
important to periodically review model specification because over a period of years the variables
selected change and the regression parameters also evolve over time.
Models B-E add cross-correlation information to the macroeconomic variables. Model B adds
one period of lagged large stock returns to Model A as follows:
$$R_{St} = \sum_{i=1}^{N} a_i MV_i + b_1 R_{L,t-1} + \varepsilon_t \qquad \text{(Model B)}$$
where $b_1$ is a regression coefficient showing the significance of the cross-correlation variable
$R_{L,t-1}$. Returns with lags of up to six periods were examined, but only the first two lags proved
significant in any of the estimation periods if macroeconomic variables were also included in the
model. Model B could be estimated by simply adding lagged large stock returns to the variables
already selected by Model A. However, then cross-correlations are only significant in about half
of the estimation periods. An alternative way of modelling is to force cross-correlations into each
of the models and then select the macroeconomic variables using stepwise regression. Using this
technique, cross-correlations are significant in all estimation periods; but as seen in the next
section, they do not capture as much information as various lagged macroeconomic variables.
Model C replaces the lagged large stock returns in Model B with one period of lagged asymmetric
returns as shown below:
$$R_{St} = \sum_{i=1}^{N} a_i MV_i + b_2 R_{L,t-1}(\text{up}) + b_3 R_{L,t-1}(\text{down}) + \varepsilon_t \qquad \text{(Model C)}$$
This type of model is attributable to McQueen et al. (1996). It allows cross-correlations to affect
small stock returns differently, depending on whether last period's large stock excess returns were
positive [$R_{L,t-1}(\text{up})$] or negative [$R_{L,t-1}(\text{down})$], and $b_2$ and $b_3$ are regression coefficients
showing the significance of cross-correlations.
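The text does not spell out how the two asymmetric regressors are constructed. One common construction, assumed here purely for illustration, keeps the lagged large stock excess return in the 'up' series when it is positive and in the 'down' series when it is negative, with zeros elsewhere. The function name `split_up_down` is hypothetical.

```python
def split_up_down(lagged_large_returns):
    """Split lagged large-stock excess returns into 'up' and 'down' regressors.

    Assumed construction: up = return if positive, else 0;
    down = return if negative, else 0. A zero return falls in neither series.
    """
    up = [r if r > 0 else 0.0 for r in lagged_large_returns]
    down = [r if r < 0 else 0.0 for r in lagged_large_returns]
    return up, down


up, down = split_up_down([1.2, -0.5, 0.0, 2.0])
print(up, down)
```

Under this construction the two series sum to the original lagged return, so Model B is the special case of Model C in which the two coefficients are equal.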
Model D adds a variable defined as lagged small stock minus large stock returns to the
macroeconomic variables (significance measured by $b_4$) in the following manner:
$$R_{St} = \sum_{i=1}^{N} a_i MV_i + b_4 (R_{S,t-1} - R_{L,t-1}) + \varepsilon_t \qquad \text{(Model D)}$$
This variable, used by Knez and Ready (1996), adds information about small stocks' own
autocorrelations to the cross-correlation information contained in lagged large stock returns.
Finally, Model E includes any significant cross-correlation variables ($CCV_j$) involving large
and small stocks lagged one or two periods as follows:
$$R_{St} = \sum_{i=1}^{N} a_i MV_i + \sum_{j=1}^{M} b_j CCV_j + \varepsilon_t \qquad \text{(Model E)}$$
During about one-fourth of the estimation periods there were no significant cross-correlation
terms (all $b_j = 0$) and Models A and E were identical. In other periods, usually one of the
cross-correlation variables was significant, but the included variable frequently changed between
$R_{L,t-1}$, $R_{L,t-2}$, $(R_{S,t-1} - R_{L,t-1})$, $R_{L,t-1}(\text{up})$, and $R_{L,t-1}(\text{down})$. Stock return lags of longer than
two months were not significant in any estimation periods.
Models F-H correspond to Models B-D, respectively, except that only cross-correlation
information is used to forecast small stock returns. Macroeconomic variables and seasonal
dummies are not included. These models provide information about the role of cross-correlations
versus macroeconomic variables in forecasting small stock returns.
RESULTS
The results for one-month-ahead out-of-sample forecasts of small stock returns are presented in
Table I. For the years 1961-1994, a buy-and-hold strategy that always forecasts an up market
would have been correct 52.94% of the time. The abnormal return for a buy-and-hold strategy
would have been, by definition, 0% per month. The directional forecast accuracy for Models
A-H exceeds that of a buy-and-hold strategy, and positive abnormal returns exist for all models,
given zero transaction costs.⁴
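Directional forecast accuracy, as used in this comparison, can be computed as the share of months in which the sign of the forecast matches the sign of the realized excess return. The sketch below is illustrative only: the function name and toy data are hypothetical, and treating a zero return as 'not up' is an assumed convention.

```python
def directional_accuracy(forecasts, realized):
    """Percentage of months in which forecast and realized excess return agree in sign."""
    hits = sum((f > 0) == (r > 0) for f, r in zip(forecasts, realized))
    return 100.0 * hits / len(realized)


realized = [1.5, -2.0, 0.8, 2.2, -0.3]    # toy excess returns, not from the paper
model = [0.4, -0.2, -0.1, 0.3, -0.1]      # toy model forecasts
buy_and_hold = [1.0] * len(realized)      # always forecasts an up market

print(directional_accuracy(model, realized))
print(directional_accuracy(buy_and_hold, realized))
```

A buy-and-hold strategy corresponds to a forecast that is always positive, so its directional accuracy is simply the share of months with positive excess returns.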
The base case involving only macroeconomic variables, Model A, provides 53.81% directional
accuracy. It holds stock for 64.62% of the 407 months in the sample and only trades in 20.15% of
all months. In the absence of transaction costs it provides abnormal returns of 0.431% per
month, which is only slightly lower than for Models B, D, and E, which also include information
about cross-correlations. Model E allows one or more significant lagged returns from large stocks
or small stocks to enter the model. It provides the largest abnormal returns of any of the
models, 0.516% per month, in the absence of transaction costs. Models B-D force
cross-correlations into the estimating equation, even if the relationships are not significant in all
the in-sample estimation periods. As a result, these models do little to improve upon the results from
macroeconomic variables alone. In fact, the asymmetric version of lagged returns in Model C
yields lower abnormal returns, and Model B provides a lower directional forecasting accuracy, than
Model A. Focusing on the theoretically preferred model, Model E, we note that addition of
cross-correlations adds a 1.47 percentage point improvement in directional forecast accuracy and a
0.085 percentage point increase in abnormal returns (at zero trading cost) over Model A.
⁴ Returns for large stocks, as other studies have shown, cannot be forecasted as well as small stock returns. A
buy-and-hold strategy would have forecasted market direction correctly for 55.39% of the months in the sample, while
the best forecasting model has a directional forecasting accuracy of only 55.15% and provides abnormal returns (before
transaction costs) of 0.174% per month.
Models F-H show the forecasting ability of cross-correlations alone. Directional forecast
accuracy is generally as high or higher than the results obtained for Models A-E, but trading rule
profits are much lower than for the models involving macroeconomic variables. Such results are
consistent with previous studies showing that cross-correlations are a source of predictability, but
it appears that this information is not readily exploitable in terms of profitability. For example,
monthly abnormal returns for zero trading cost are 0.215% for Model F (cross-correlations),
versus 0.431% for Model A (macroeconomic variables) and 0.516% for Model E
(macroeconomic variables plus cross-correlations). Comparing these models, the marginal
contribution of macroeconomic variables is 0.321% (0.516 - 0.215) versus 0.085% (0.516 - 0.431)
for cross-correlations. While cross-correlations add little to profitability generated by
macroeconomic variable models, macroeconomic variables can add significantly to the abnormal
returns generated by cross-correlations alone. This evidence is indicative of the possibility that
cross-correlations may serve as a proxy for omitted lagged macroeconomic variables.
Encompassing tests are another way to judge the relative out-of-sample importance of
macroeconomic variable models versus statistical models with cross-correlations. Donaldson
and Kamstra (1996, p. 57) note that a model, such as our Model A, should be preferred to
another model, such as Model F, if A explains what F cannot explain and F cannot explain what
A cannot explain. They show that a formal test for encompassing between any two models, such
as A and F, involves regressing the forecast error from Model A on the forecast from Model F to
see if Model F can explain what Model A cannot explain. Then the forecast error from Model F
is regressed on the forecast from Model A to see if Model A can explain what Model F cannot
explain.⁵ For Models A, E, and F, we find that both Models A and E encompass Model F
(cross-correlations alone) at the 5% significance level. Model A is not encompassed by either Model E
or Model F and similarly Model E is not encompassed by either Model A or Model F. This test
confirms earlier results that macroeconomic variables are more important for out-of-sample
forecasts than cross-correlations. In fact, it suggests that the marginal contribution of
cross-correlations is not statistically important in distinguishing between Model A (macroeconomic
variables alone) and Model E (macroeconomic variables plus cross-correlations).
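The two encompassing regressions described above can be sketched as follows. This is a minimal illustration rather than the procedure actually run for the paper: the function names are hypothetical, and the 5% test uses the large-sample normal critical value of 1.96, which is an assumption that is reasonable for several hundred monthly forecasts.

```python
import math


def slope_t_stat(x, y):
    """OLS slope and its t-statistic for y = c0 + c1 * x + error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se


def encompassing(actual, forecast_a, forecast_f, critical=1.96):
    """Report which of two forecasting models (if either) encompasses the other."""
    err_a = [y - f for y, f in zip(actual, forecast_a)]
    err_f = [y - f for y, f in zip(actual, forecast_f)]
    _, t_a1 = slope_t_stat(forecast_f, err_a)  # can F explain A's errors?
    _, t_b1 = slope_t_stat(forecast_a, err_f)  # can A explain F's errors?
    a1_sig = abs(t_a1) > critical
    b1_sig = abs(t_b1) > critical
    if b1_sig and not a1_sig:
        return "A encompasses F"
    if a1_sig and not b1_sig:
        return "F encompasses A"
    return "neither"
```

The decision rule mirrors the verbal description: a model encompasses its rival when its forecast explains the rival's errors while the rival's forecast fails to explain its own.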
Adding trading costs of 0.25% per trade, or 0.5% roundturn, which Berkowitz, Logue, and
Noser (1988) found to be the average trading costs faced by large institutional investors, reduces
abnormal returns for the best model, Model E, to 0.457% per month. In the absence of trading
costs, the annualized abnormal return is 6.37%, which is nearly double the return found by
Pesaran and Timmermann (1995) for trading S&P 500 stocks. Although trading rule profits
appear to be substantial, small stocks have much larger transaction costs than the 0.5%
roundturn costs identified for trading large stocks. Knez and Ready (1996) suggest that
roundturn costs could approach 6%, or trading costs of 3% per trade in our framework. The last
column of Table I shows that none of the models provide positive abnormal returns given such
costs. However, Keim and Madhavan (1995) calculate that trading costs of small stocks for large
investors are 1.35%-2.86% (2.7%-5.36% roundturn costs). For 1% trading costs, Models A-E
all provide positive abnormal returns. Monthly abnormal returns for the best models are 0.230%
(2.795% annualized) for Model A and 0.280% (3.412% annualized) for Model E. For 1.35%
trading costs, Models A and E provide monthly abnormal returns of 0.159% and 0.197%, which
decline to 0.028% and 0.044% for 2% trading costs. Whether such returns are exploitable
depends upon the proper identification of trading costs. Also, even if trading costs are as low as
1.35%, our results may not constitute a violation of market efficiency. Trading costs were higher
and the technology needed to exploit this predictability may not have been available during the
earlier years of our data set.
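The figures in this paragraph can be checked with simple arithmetic. The sketch below rests on two assumptions about how the reported numbers were produced: that annualized values compound the monthly abnormal return over twelve months, and that net returns equal the zero-cost return minus the cost per trade times the fraction of months in which a trade occurs (20.15% for Model A). Under these assumptions the reported values are reproduced to within rounding. The function names are hypothetical.

```python
def annualize(monthly_pct):
    """Compound a monthly percentage return over twelve months."""
    return ((1 + monthly_pct / 100) ** 12 - 1) * 100


def net_of_costs(gross_monthly_pct, trade_fraction, cost_pct):
    """Subtract expected monthly trading costs from a gross abnormal return.

    trade_fraction is the share of months with a trade (0.2015 for Model A);
    cost_pct is the cost per trade in percent.
    """
    return gross_monthly_pct - trade_fraction * cost_pct


# Model E at zero trading cost, annualized.
print(round(annualize(0.516), 2))

# Model A net monthly abnormal returns for three cost levels.
for cost in (1.0, 1.35, 2.0):
    print(cost, round(net_of_costs(0.431, 0.2015, cost), 3))
```

The calculation makes the role of trading frequency explicit: because Model A trades in only about one month in five, a 1% cost per trade reduces its monthly abnormal return by roughly 0.2 percentage points rather than by the full 1%.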
ALTERNATIVE TRADING RULES AND FORECASTING TECHNIQUES
Table II considers the abnormal returns that could have been obtained using simple trading rule
strategies examined in previous studies. Model I assumes that an investor buys small stocks
whenever monthly returns to large stocks have been positive. The strategy provides a directional
accuracy of 54.97% and an abnormal return of 0.281% per month, in the absence of transaction
costs. Model J buys small stocks if their own most recent excess return has been positive. The
abnormal returns obtainable from this strategy are 0.241% per month, or just slightly below that of
Model I. Model K requires an investor to buy and hold small stocks if the excess return to either the
small or large stock portfolio has been positive in the most recent month, while Model L buys small
stocks only if returns to both portfolios have been positive in the last month. Model K provides the
largest abnormal return among these simple strategies. The abnormal return of 0.341% per month
(in the absence of transaction costs) is smaller than the abnormal returns generated by Models A or
E, although the directional forecasting accuracy is similar to that obtained with any of the previous
models. Such differences in abnormal returns illustrate the importance of including
macroeconomic variables within a trading strategy designed to exploit predictability generated by
cross-correlations. However, the similarities in directional forecasting accuracy and the differences in
abnormal returns between macroeconomic variable models and the cross-correlation models may
be due to forecasting techniques employed. Perhaps cross-correlations influence future returns in a
more complex non-linear pattern than is captured by OLS regression.
⁵ Chong and Hendry (1986) first introduced encompassing tests to make model comparisons within-sample. Donaldson
and Kamstra (1996) provide a clear explanation of how to use the tests to judge forecasting performance out-of-sample.
Their notation applied to our models yields the first regression equation: $e_{At} = a_0 + a_1 f_{Ft} + \varepsilon_t$, where $e_{At}$ is
the forecast error from Model A in period $t$, $a_0$ and $a_1$ are regression parameters, $f_{Ft}$ is the forecast from Model F,
and $\varepsilon_t$ is an error term. The second regression equation is: $e_{Ft} = b_0 + b_1 f_{At} + \eta_t$, where $e_{Ft}$ is the forecast
error from Model F in period $t$, $b_0$ and $b_1$ are regression parameters, $f_{At}$ is the forecast from Model A, and $\eta_t$ is an
error term. If $a_1$ is significant at the 5% level and $b_1$ is not, then Model F encompasses Model A. If $b_1$ is significant
at the 5% level and $a_1$ is not, then Model A encompasses Model F. If both $a_1$ and $b_1$ are significant or if neither is
significant, then neither model encompasses the other.
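The four simple rules (Models I-L) described in this section reduce to sign checks on last month's returns. The sketch below is illustrative: the function name is hypothetical, and using excess returns for every rule is an assumption, since the text refers to 'returns' for some rules and 'excess returns' for others.

```python
def rule_signals(small_prev, large_prev):
    """Buy signals for the four simple rules, given last month's excess returns.

    True means hold small stocks next month; False means hold the risk-free asset.
    """
    return {
        "I": large_prev > 0,                     # large stocks rose
        "J": small_prev > 0,                     # small stocks rose
        "K": small_prev > 0 or large_prev > 0,   # either portfolio rose
        "L": small_prev > 0 and large_prev > 0,  # both portfolios rose
    }


print(rule_signals(small_prev=-0.5, large_prev=1.0))
```

By construction, Model K signals a purchase at least as often as Models I and J, and Model L at most as often.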
Several previous studies have assumed that an investor would hold the large stock portfolio,
instead of T-bills, whenever the forecasted return for small stocks is negative. This doubles the
number of stock trades made, and would add at least 0.25% in trading costs to the costs of buying
and selling small stocks. Even in the zero transaction cost environment, this strategy provides
smaller abnormal returns for Models A–H than in the case where the risk-free asset is held in
forecasted down months. For the simple trading rules embodied in Models I–L, abnormal
returns in the absence of transaction costs are 0.02 to 0.12 percentage points higher per month
than those presented in Table II. For example, Model I provides abnormal monthly returns of
0.366% for this case, versus 0.281% in Table II. Once trading costs of even 0.25% are included,
abnormal returns are lower than in the case where the investor holds T-bills during forecasted
down markets.
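The comparison between holding T-bills and holding large stocks in forecasted down months can be sketched as a simple switching rule. The function name, the cost treatment (a flat charge in each month the position changes) and all numbers below are illustrative assumptions, not the cost model or data used in the paper.

```python
import numpy as np

def switching_returns(forecast, small, alternative, cost=0.0):
    """Monthly returns (in %) of a rule that holds small stocks when the
    forecast is positive and the alternative asset otherwise.

    `cost` is a hypothetical charge (in %) applied in every month in which
    the position changes; it is an assumption made for illustration.
    """
    forecast = np.asarray(forecast, dtype=float)
    small = np.asarray(small, dtype=float)
    alternative = np.asarray(alternative, dtype=float)

    in_small = forecast > 0                       # True -> hold small stocks
    gross = np.where(in_small, small, alternative)

    # A switch occurs whenever this month's position differs from last month's.
    switched = np.concatenate(([False], in_small[1:] != in_small[:-1]))
    return gross - cost * switched

# Made-up monthly figures, in percent.
forecast = [0.5, -0.2, 0.3, 0.1]
small = [2.0, -3.0, 1.5, 0.5]
tbill = [0.4, 0.4, 0.4, 0.4]
large = [1.0, -1.0, 0.8, 0.2]

with_tbills = switching_returns(forecast, small, tbill, cost=0.25)
# Switching into large stocks involves twice as many stock trades,
# represented here (as an assumption) by doubling the charge.
with_large = switching_returns(forecast, small, large, cost=0.50)
```

With these made-up inputs the rule leaves small stocks in the second month and returns in the third, so two switches are charged under either alternative.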
Non-parametric techniques, such as those employed by Knez and Ready (1996), may improve
upon the results from ordinary least squares forecasts if the relationship between returns and
fundamental variables or cross-correlations is non-linear in nature.⁶ Perhaps the most popular of
the non-parametric regression (NPR) techniques is the kernel density estimator. It is used in this
paper as an alternative to OLS estimation and forecasting techniques. Further description of
non-parametric estimation is provided in the Appendix.
⁶ An Almon lag structure could be used to introduce some degree of non-linearity into the models without going to the
added complexity of non-parametric estimation. Almon lags of up to twelve months and first-, second-, and third-degree
polynomials for the lag structure were examined, but results are not reported. The Almon lag models forecasted no
better, and generally performed worse out of sample, than the models estimated by ordinary least squares (OLS).
Similarly, a two-equation vector autoregression (VAR) model that uses the same macroeconomic variables to estimate
returns for small and large stocks simultaneously failed to forecast as well as the OLS models for the 1961–1994 period.
The rolling regression window of 120 observations used for OLS estimation is maintained for
non-parametric regression (NPR) estimation. For each model, the corresponding OLS variable
set was used to estimate each month's model. This was necessary because there is no step-wise
NPR estimator. Table III provides evidence about the one-step-ahead forecasting performance of
some representative non-parametric forecasting models for the period 1961–1994. Model A,
which includes only macroeconomic variables, provides abnormal returns of 0.134% per month
and forecasts the sign of one-month-ahead returns in 54.79% of all cases. Model B, which adds
lagged large stock returns, gives a directional forecasting accuracy of 55.28% and monthly
abnormal returns of 0.294%. Model D, which adds lagged small and large stock returns to the
macroeconomic variables, provides a directional accuracy of 55.03% and abnormal
returns of 0.302% per month. Since non-parametric techniques are being used, the asymmetric
version of the forecasting models does not offer any theoretical improvement over Model B.
(Actual regressions confirmed that in practice, Model C also generates lower abnormal returns
and has a lower directional accuracy than Model B.) The contribution of cross-correlations in
moving from Model A to Models B or D is 0.160 to 0.168 percentage points per month, which is
somewhat larger than the previous results from OLS forecasts. However, we note that NPR
abnormal returns are consistently lower than those generated by comparable OLS models, even
though directional forecast accuracy is again similar. The apparent inferiority of non-parametric
techniques may be due to the difficulties, in practice, with estimating various complicated models
involving macroeconomic variables. Problems may also arise from the well-known over-fitting
problem that plagues non-parametric and neural network forecasting.⁷ Other studies have found
non-linearities in stock return series, but our results suggest that the non-linearities are not easily
exploitable, perhaps because the relationships between variables change over time.
⁷ Backpropagation neural networks were also constructed to forecast for Models A–H, with results generally better than
for the non-parametric models, but worse than for the OLS forecasting models. For example, monthly
abnormal returns (in the absence of trading costs) over the 1961–1994 period are 0.176% for Model A and 0.407% for
Model D for neural networks. Comparable results are 0.134% and 0.302% for non-parametric regression forecasts and
0.431% and 0.447% for OLS forecasting techniques.
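The rolling 120-observation window and the one-step-ahead sign forecasts discussed above can be sketched as follows. The sketch uses plain OLS via `numpy.linalg.lstsq`; the function names and the simulated data are hypothetical, and the variable selection used in the paper is not reproduced.

```python
import numpy as np

def rolling_ols_forecasts(X, y, window=120):
    """One-step-ahead OLS forecasts from a rolling estimation window.

    X: (T, k) array of predictors, already lagged so that row t is known
       before y[t] is realized; y: (T,) array of returns.
    Returns forecasts for observations window, ..., T-1.
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    forecasts = []
    for t in range(window, len(y)):
        # Fit on the most recent `window` observations only.
        beta, *_ = np.linalg.lstsq(X[t - window:t], y[t - window:t], rcond=None)
        forecasts.append(X[t] @ beta)
    return np.array(forecasts)

def directional_accuracy(forecasts, realized):
    """Share of months in which forecast and realized return share a sign."""
    return float(np.mean(np.sign(forecasts) == np.sign(realized)))

# Simulated data for illustration only (not the paper's data).
rng = np.random.default_rng(0)
T = 300
X = rng.normal(size=(T, 2))
y = 0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(size=T)

f = rolling_ols_forecasts(X, y, window=120)
acc = directional_accuracy(f, y[120:])
```

Each forecast uses only data available before the month being predicted, which is what makes the exercise out of sample.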
VARIABILITY OF FORECAST ACCURACY BETWEEN TIME PERIODS
We now explore the accuracy and profitability of the forecasting models over various subperiods
of our data set. Forecasting results for the years 1983–1994 are shown in Table IV. This period
corresponds to the period of analysis used by McQueen et al. (1996). Most of the models perform
better over this subperiod than over the entire 1961–1994 period. Model E provides the largest
abnormal returns (0.689% per month in the absence of transaction costs), while Model C gives
the best directional forecasting accuracy (59.03%). Cross-correlations add very little to the
predictability achieved by macroeconomic variables, but macroeconomic variables seem to
improve upon predictability from cross-correlations. In this period non-parametric techniques
perform as well as OLS forecasts and the NPR estimation of Model D actually generates slightly
larger trading rule abnormal returns than the OLS model.
McQueen et al. (1996) found abnormal returns of 0.55% per month over the 1983–1994
period using a strategy that is represented by their Model I. In contrast, our Model I gives
abnormal returns of 0.397% per month by holding T-bills during forecasted down months
(instead of the large stock portfolio). We can replicate McQueen et al.'s (1996) results, but for
trading costs of 0.5% or higher it is better to adopt our Model I. Also, the strategy of
switching between large and small stocks does not work as well as holding T-bills during the
earlier 1961–1982 subperiod. Comparison of Models A–E with Models H and I shows the
importance of macroeconomic variables relative to cross-correlations. The small marginal
contribution of cross-correlations (0.086%) is virtually identical to the contribution over the
entire data set (0.085%).
Table V presents results of selected forecasting models for the 1988–1992 period analysed by
Knez and Ready (1996). Directional accuracy and abnormal returns are generally higher for this
period than for the whole data set or in other subperiods. Model A has a directional accuracy of
53.33% and abnormal returns of 0.984% before transaction costs. Model E again gives the best
results with a 63.33% directional accuracy and an abnormal monthly return of 1.230%. Knez
and Ready (1996) used a non-parametric model similar to Model H to forecast weekly changes in
stock prices during this same period. Their strategy generated annual abnormal returns (before
transaction costs) that ranged from 20.3% down to 5.28% depending upon the switching point
and whether the trade occurred at the last trading price or the estimated execution price. In
contrast, Models A and E, which use monthly data, provide annualized abnormal returns of
12.47% and 15.80%. Model H, which most closely approximates the Knez and Ready
methodology, provides annualized abnormal returns of 3.95%.
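The annualized figures for Models A and E are consistent with compounding the monthly abnormal returns over twelve months. The short check below assumes that compounding, rather than multiplication by twelve, is the convention used; the function name is hypothetical.

```python
def annualize(monthly_pct):
    """Compound a monthly percentage return over twelve months."""
    return ((1.0 + monthly_pct / 100.0) ** 12 - 1.0) * 100.0

model_a = annualize(0.984)   # about 12.47
model_e = annualize(1.230)   # about 15.80
```

Simple multiplication would instead give 11.81% and 14.76%, which do not match the reported values.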
Given the large differences in forecasting performance over the full period versus the
subperiods considered in other studies, we also compare forecasting performance of Models A
and E over 5-year subperiods. Results for other models follow a similar pattern and are not
presented. In the absence of transaction costs, mean abnormal monthly returns for Model A for
1961–1965, 1966–1970, 1971–1975, 1976–1980, 1981–1985, 1986–1990, and 1991–1994 are
0.384%, 0.770%, 1.420%, 0.445%, 0.106%, 1.400%, and 0.065%. For Model E abnormal
returns are 0.523%, 0.652%, 1.563%, 0.045%, 0.100%, 1.624%, and 0.039% for the same
years. The models have the best forecasting accuracy and largest abnormal returns during the
years with the greatest number of down movements in stock prices. They do not perform as well
when markets are trending upward. Perhaps most importantly, the results show considerable
variability in profitability of macroeconomic variable and cross-correlation trading strategies
over time. In fact the entire success of the trading rules rests upon returns in three of the
subperiods: 1966–1970, 1971–1975, and 1986–1990. In other periods the abnormal returns are
either negative or negligible. This variability of success over time is a type of risk that, along
with transaction costs, may explain why abnormal returns persist over 34 years of out-of-sample
tests.
SUMMARY AND CONCLUSIONS
This paper has examined two sources of predictability in small stock returns: macroeconomic
variables and cross-correlations. Although both yield similar out-of-sample directional forecast
accuracy, macroeconomic variables are preferred using encompassing tests and on the basis of
trading rule profits. Cross-correlations marginally improve upon the forecast accuracy and
trading rule profits generated by models using macroeconomic variables alone, while adding
macroeconomic variables to cross-correlation models substantially increases abnormal returns.
Cross-correlations seem to serve as a proxy for omitted lagged macroeconomic variables in
studies of small stock predictability. They may be significant in some periods and add
marginally to trading rule profits because the macroeconomic variables included in the best
model are generally stable from month to month, but do change slowly over time. During
periods when the macroeconomic variables included are changing, cross-correlations may pick
up changing market conditions faster than lagged macroeconomic variables.
APPENDIX
To illustrate the use of non-parametric techniques and the kernel density estimator, we denote the
time series of returns to small stocks by $Y_t$, where $Y_t = y_1, y_2, y_3, \ldots, y_n$ for
$t = 1, 2, 3, \ldots, n$. A group of fundamental variables and cross-correlation terms, denoted by
the vector $X_t$, helps to explain $Y_t$. The return series may be represented by
$Y_t = m(X_t) + \varepsilon_t$, where $m(X_t)$ is an arbitrary fixed function of unknown form, and
$\varepsilon_t$ is the error term. When $m(X_t)$ is linear, $Y_t$ can be estimated by linear
regression, but if $m(X_t)$ is of an unknown non-linear form, $Y_t$ should be estimated by
non-parametric regression techniques. Assuming sufficient smoothness, the time series observations
of $X_t$ near a point of evaluation $x_0$ should be close to $x_0$ and the corresponding values for
$Y_t$ should be near $m(x_0)$. All observations within some neighbourhood of $x_0$ (denoted by $h$)
are assumed to influence $Y_t$, but values of $X_t$ closer to $x_0$ are assumed to have greater
influence. $m(x_0)$ can then be estimated by a weighted average of the values of $Y_t$, where the
weights depend upon the distances of any $X_t$ from $x_0$.
A common way to assign weights within the neighbourhood about $x_0$ is to use a non-parametric
kernel density function, $K_h(x)$, which is a probability density function with finite mean
and variance satisfying the conditions that $K_h(x) \ge 0$ and
$\int_{-\infty}^{\infty} K_h(x)\,dx = 1$. The parameter, $h$,
which can be estimated from the data or selected a priori, controls the size of the neighbourhood
about $x_0$. It is also called the bandwidth, window width, or the smoothing parameter since it
controls the smoothness of the kernel and essentially determines which observations are used in
local averaging. The statistics literature has shown that many different kernel functions can be
used for $K_h(x)$ and that results are not very sensitive to the choice of kernel. Since the Gaussian
kernel is the most popular and since it is readily available in the GAUSSX statistical package, it is
used in this paper. The Nadaraya–Watson kernel estimator $\hat{m}(x_0)$ of $m(x_0)$, or the
conditional mean of $Y_t$, is given by:
$$\hat{m}(x_0) = \frac{\sum_{t=1}^{n} K_h(x_0 - X_t)\, Y_t}{\sum_{t=1}^{n} K_h(x_0 - X_t)}$$
Assuming a Gaussian kernel, the weighting scheme is based upon Euclidean distance away from
$x_0$ and the kernel simplifies to:
$$K_h(x) = \frac{1}{h\sqrt{2\pi}}\, e^{-x^2/2h^2}$$
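The estimator and Gaussian kernel above can be sketched in a few lines. This is a minimal illustration with hypothetical function names and made-up data, not the GAUSSX routine used in the paper, and the bandwidth values are arbitrary.

```python
import numpy as np

def gaussian_kernel(u, h):
    """Gaussian kernel K_h(u) = exp(-u^2 / (2 h^2)) / (h * sqrt(2 pi))."""
    return np.exp(-u**2 / (2.0 * h**2)) / (h * np.sqrt(2.0 * np.pi))

def nadaraya_watson(x0, X, Y, h):
    """Kernel-weighted average of Y, with weights based on the Euclidean
    distance between each observation X_t and the evaluation point x0."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(Y), -1)   # (n, k) predictors
    dist = np.linalg.norm(X - np.asarray(x0, dtype=float), axis=1)
    w = gaussian_kernel(dist, h)
    return float(np.sum(w * Y) / np.sum(w))

# Made-up example: recover a non-linear function from a grid of points.
X = np.linspace(-3.0, 3.0, 61)
Y = X**2

tight = nadaraya_watson(1.0, X, Y, h=0.01)   # tiny bandwidth: nearest point dominates
smooth = nadaraya_watson(0.0, X, Y, h=0.3)   # wider bandwidth: local averaging
```

The kernel's normalizing constant cancels in the ratio, so only the relative weights matter. A wider bandwidth averages over more neighbours, which smooths the estimate at the cost of some bias near curvature.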
REFERENCES
Badrinath S, Kale J, Noe T. 1995. Of shepherds, sheep, and the cross-autocorrelations in equity returns. Review of Financial Studies 8: 401–430.
Berkowitz S, Logue D, Noser E Jr. 1988. The total costs of transactions on the NYSE. Journal of Finance 43: 97–112.
Boudoukh J, Richardson M, Whitelaw R. 1994. A tale of three schools: Insights on autocorrelations of short-horizon stock returns. Review of Financial Studies 7: 539–573.
Brockman P, Mossman C, Olson D. 1997. Predictability of Canadian stock returns and choice of equity portfolios. Working paper, University of Manitoba.
Chong Y, Hendry D. 1986. Econometric evaluation of linear macroeconomic models. Review of Economic Studies 53: 671–690.
Connor G. 1995. The three types of factor models: A comparison of their explanatory power. Financial Analysts Journal 51: 42–46.
Conrad J, Gultekin M, Kaul G. 1991. Asymmetric predictability of conditional variances. Review of Financial Studies 4: 597–622.
Donaldson R, Kamstra M. 1996. Forecast combining with neural networks. Journal of Forecasting 15: 49–61.
Fama E, French K. 1989. Business conditions and expected returns on stocks and bonds. Journal of Financial Economics 25: 23–49.
Ferson W, Korajczyk R. 1995. Do arbitrage pricing models explain the predictability of stock returns? Journal of Business 68: 309–349.
Grinblatt M, Titman S, Wermers R. 1995. Momentum investment strategies, portfolio performance, and herding: A study of mutual fund behavior. American Economic Review 85: 1088–1105.
Hameed A. 1997. Time-varying factors and cross-autocorrelations in short-horizon stock returns. Journal of Financial Research 20: 435–458.
Jegadeesh N, Titman S. 1995. Overreaction, delayed reaction, and contrarian profits. Review of Financial Studies 8: 972–993.
Kairys J. 1993. Predicting sign changes in the equity risk premium using commercial paper rates. Journal of Portfolio Management 19: 41–51.
Keim D, Madhavan A. 1995. Execution costs and investment performance: An empirical analysis of institutional equity trades. Working paper 9-95, Wharton School, University of Pennsylvania.
Knez P, Ready M. 1996. Estimating the profits from trading strategies. Review of Financial Studies 9: 1121–1163.
Lo A, MacKinlay C. 1990. When are contrarian profits due to stock market overreaction? Review of Financial Studies 3: 175–205.
McQueen G, Pinegar M, Thorley S. 1996. Delayed reaction to good news and the cross-autocorrelation of portfolio returns. Journal of Finance 51: 889–919.
Pesaran M, Timmermann A. 1995. Predictability of stock returns: Robustness and economic significance. Journal of Finance 50: 1201–1228.