
Asset Pricing

ECONM2035
Richard Payne
School of Economics, Finance and Management,
University of Bristol
© Richard G. Payne, September 25, 2008
Contents
1 Introduction 13
2 Review of Statistics 15
2.1 Basic probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.1.1 Example: rolling dice . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.2 Further conditions on probabilities . . . . . . . . . . . . . . . 16
2.1.3 Random variables . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.1.4 CDFs and PDFs . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.1.5 Discrete random variables: the probability mass function . . . 17
2.1.6 Continuous random variables: the probability density function 18
2.1.7 Features of PMFs and PDFs . . . . . . . . . . . . . . . . . . . 18
2.2 Basic Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2.1 Expectations and functions of RVs . . . . . . . . . . . . . . . 19
2.2.2 The Variance and Standard Deviation . . . . . . . . . . . . . 20
2.2.3 Manipulating Variances . . . . . . . . . . . . . . . . . . . . . 20
2.2.4 Skewness and Kurtosis . . . . . . . . . . . . . . . . . . . . . . 20
2.3 Multivariate probability and statistics . . . . . . . . . . . . . . . . . . 22
2.3.1 The joint distribution and marginal distributions . . . . . . . 23
2.3.2 The conditional density . . . . . . . . . . . . . . . . . . . . . . 23
2.3.3 Moments and functions of two random variables . . . . . . . . 24
2.3.4 Properties of Covariance . . . . . . . . . . . . . . . . . . . . . 24
2.3.5 Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3.6 Independence and un-correlatedness . . . . . . . . . . . . . . . 25
2.3.7 Variances of combinations of RVs . . . . . . . . . . . . . . . . 26
2.4 Sample moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4.1 The Sample Mean . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4.2 The distribution of the sample mean . . . . . . . . . . . . . . 27
2.4.3 The Central Limit Theorem . . . . . . . . . . . . . . . . . . . 28
2.4.4 Other sample moments . . . . . . . . . . . . . . . . . . . . . . 28
2.5 Some univariate probability distributions . . . . . . . . . . . . . . . . 29
2.5.1 The χ² distribution . . . . . . . . . . . . . . . . . . . . . . . . 30
2.5.2 The Student-t distribution . . . . . . . . . . . . . . . . . . . . 30
2.5.3 The Exponential Distribution . . . . . . . . . . . . . . . . . . 31
3 Present values 33
3.1 Present values and the time-value of money . . . . . . . . . . . . . . 33
3.1.1 Simple interest . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.1.2 Compound interest . . . . . . . . . . . . . . . . . . . . . . . . 34
3.1.3 Compounding for sub-periods . . . . . . . . . . . . . . . . . . 34
3.1.4 Continuous compounding . . . . . . . . . . . . . . . . . . . . 36
3.1.5 Compounding and future values . . . . . . . . . . . . . . . . . 36
3.2 Discounting and the time value of money . . . . . . . . . . . . . . . . 37
3.2.1 Present values . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.2.2 Discounting example . . . . . . . . . . . . . . . . . . . . . . . 39
3.2.3 Present values of streams of payments . . . . . . . . . . . . . 39
3.2.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.2.5 Annuity example . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.2.6 Present values and uncertainty . . . . . . . . . . . . . . . . . . 41
3.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.4 Fixed Income Securities and Markets . . . . . . . . . . . . . . . . . . 42
3.4.1 Size of Bond Markets . . . . . . . . . . . . . . . . . . . . . . . 42
3.4.2 Types of bond . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.3 Zero coupon (discount) bonds . . . . . . . . . . . . . . . . . . 44
3.4.4 Coupon bonds . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.4.5 Coupon bonds as portfolios of zero-coupon bonds . . . . . . . 45
3.4.6 Slightly more exotic bonds . . . . . . . . . . . . . . . . . . . . 46
3.4.7 Bonds and Geography . . . . . . . . . . . . . . . . . . . . . . 47
3.4.8 Bonds with option-like features . . . . . . . . . . . . . . . . . 47
3.4.9 Varieties of Government debt . . . . . . . . . . . . . . . . . . 48
4 Fixed income securities 49
4.1 Bonds: valuation and the term-structure . . . . . . . . . . . . . . . . 49
4.1.1 The term-structure of interest rates . . . . . . . . . . . . . . . 50
4.1.2 Valuation of zero-coupon bonds . . . . . . . . . . . . . . . . . 51
4.1.3 Valuation of coupon bonds: annual coupon payment . . . . . . 52
4.1.4 Valuation of coupon bonds: semi-annual coupon payment . . . 53
4.1.5 Zero-coupon bonds and the term structure . . . . . . . . . . . 53
4.1.6 Valuing a bond by replication . . . . . . . . . . . . . . . . . . 54
4.1.7 Example: coupon bond valuation . . . . . . . . . . . . . . . . 55
4.1.8 Coupon bonds and the term-structure . . . . . . . . . . . . . 56
4.1.9 Example: bootstrapping the term structure . . . . . . . . . . 56
4.2 Yield to maturity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.1 Features of YTMs . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.2.2 Example; YTM calculation . . . . . . . . . . . . . . . . . . . . 58
4.2.3 YTMs and coupon rates . . . . . . . . . . . . . . . . . . . . . 59
4.2.4 Yields and returns . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3 Corporate Bonds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.1 Bond ratings and rating agencies . . . . . . . . . . . . . . . . 61
4.3.2 Example: Moody's bond ratings . . . . . . . . . . . . . . . . . 61
4.3.3 Bond ratings and probability of default . . . . . . . . . . . . . 62
4.3.4 Default risk, bond prices and yields . . . . . . . . . . . . . . . 62
4.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5 Introduction to Equities and Risk 65
5.1 Equity markets: basic facts and features . . . . . . . . . . . . . . . . 65
5.1.1 Variations on the simple equity security . . . . . . . . . . . . . 66
5.1.2 Stock Market listings . . . . . . . . . . . . . . . . . . . . . . . 66
5.1.3 Equity indices . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.1.4 Equity Market Movements: 01/01/1995 - present . . . . . . . 67
5.1.5 Summary: key features of equities . . . . . . . . . . . . . . . . 67
5.2 Choice, uncertainty and risk . . . . . . . . . . . . . . . . . . . . . . . 68
5.3 Choice under certainty . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.3.1 Axioms of choice under certainty . . . . . . . . . . . . . . . . 69
5.3.2 Preferences and utility functions . . . . . . . . . . . . . . . . . 69
5.4 Choice under uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.4.1 Axioms of choice under uncertainty: . . . . . . . . . . . . . . . 71
5.4.2 The Expected Utility Theorem . . . . . . . . . . . . . . . . . 71
5.5 Preferences towards risk . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.5.1 Risk-aversion . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.5.2 Risk-lovers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.5.3 Risk-neutrality . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.5.4 Real people . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.6 Measuring risk-aversion . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.6.1 The Coefficient of Absolute Risk Aversion . . . . . . . . . . . 75
5.6.2 The Coefficient of Relative Risk Aversion . . . . . . . . . . . . 76
5.6.3 Absolute versus relative risk aversion . . . . . . . . . . . . . . 76
5.7 A selection of widely-used utility functions . . . . . . . . . . . . . . . 77
5.8 Stochastic dominance . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.8.1 First-order stochastic dominance . . . . . . . . . . . . . . . . 78
5.8.2 Second-order stochastic dominance . . . . . . . . . . . . . . . 79
5.8.3 Mean-preserving spread . . . . . . . . . . . . . . . . . . . . . 79
6 Portfolio Theory 83
6.1 Statistical facts regarding stock returns . . . . . . . . . . . . . . . . . 84
6.1.1 Stocks versus bonds . . . . . . . . . . . . . . . . . . . . . . . . 84
6.1.2 Single stocks versus stock indices . . . . . . . . . . . . . . . . 85
6.1.3 Stock return correlations . . . . . . . . . . . . . . . . . . . . . 85
6.1.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.2 The Basic Mean-Variance Problem . . . . . . . . . . . . . . . . . . . 86
6.2.1 What are portfolio weights? . . . . . . . . . . . . . . . . . . . 87
6.2.2 Portfolio Characteristics . . . . . . . . . . . . . . . . . . . . . 87
6.2.3 Preferences . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.3 Frontier portfolios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6.3.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.4 The Portfolio Frontier . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.4.1 Diversification . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.5 The Efficient Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.6 Two-fund separation . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
6.7 A Risk-less Asset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.7.1 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.7.2 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
7 The CAPM 101
7.1 The CAPM derivation . . . . . . . . . . . . . . . . . . . . . . . . . . 101
7.1.1 Some further assumptions . . . . . . . . . . . . . . . . . . . . 103
7.1.2 Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
7.1.3 The CAPM equation . . . . . . . . . . . . . . . . . . . . . . . 104
7.2 Understanding the CAPM equation . . . . . . . . . . . . . . . . . . . 104
7.2.1 β and Risk . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
7.2.2 More stuff about β . . . . . . . . . . . . . . . . . . . . . . . . 105
7.2.3 Systematic risk and idiosyncratic risk . . . . . . . . . . . . . . 105
7.2.4 The Security Market Line . . . . . . . . . . . . . . . . . . . . 106
7.2.5 Uses of the CAPM . . . . . . . . . . . . . . . . . . . . . . . . 106
7.3 Testing the CAPM . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
7.3.1 Estimating β . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
7.3.2 Cross-sectional tests . . . . . . . . . . . . . . . . . . . . . . . 109
7.3.3 Problems and extensions . . . . . . . . . . . . . . . . . . . . . 110
7.3.4 Noise in stock returns and portfolio formation . . . . . . . . . 110
7.3.5 The Roll Critique . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.3.6 Empirical evidence . . . . . . . . . . . . . . . . . . . . . . . . 111
7.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
8 Multifactor models and the APT 113
8.1 Absence of arbitrage: review . . . . . . . . . . . . . . . . . . . . . . . 114
8.1.1 Building the arbitrage . . . . . . . . . . . . . . . . . . . . . . 114
8.1.2 Absence of arbitrage . . . . . . . . . . . . . . . . . . . . . . . 115
8.1.3 Preference restrictions . . . . . . . . . . . . . . . . . . . . . . 115
8.1.4 Formal definition of arbitrage portfolios . . . . . . . . . . . . . 115
8.2 Factor models for returns . . . . . . . . . . . . . . . . . . . . . . . . . 116
8.2.1 Restrictions on the factor model . . . . . . . . . . . . . . . . . 116
8.2.2 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
8.3 The APT: a simple derivation . . . . . . . . . . . . . . . . . . . . . . 117
8.3.1 Factor models and portfolio characteristics . . . . . . . . . . . 118
8.3.2 A two-factor, three asset example . . . . . . . . . . . . . . . . 118
8.3.3 Factor replicating portfolios . . . . . . . . . . . . . . . . . . . 118
8.3.4 Example: deriving replicating portfolio weights . . . . . . . . 119
8.3.5 Replication . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
8.3.6 Calculating the replicating weights . . . . . . . . . . . . . . . 120
8.3.7 Finally ...... . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
8.3.8 The APT equation . . . . . . . . . . . . . . . . . . . . . . . . 121
8.3.9 Example: the APT equation . . . . . . . . . . . . . . . . . . . 121
8.4 The Arbitrage Pricing Theory: formal derivation . . . . . . . . . . . . 122
8.4.1 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
8.4.2 The CAPM and the APT . . . . . . . . . . . . . . . . . . . . 124
8.5 Empirical tests of the APT . . . . . . . . . . . . . . . . . . . . . . . . 125
8.5.1 Economic or Characteristic-based factor models . . . . . . . . 125
8.5.2 Statistical Factor models . . . . . . . . . . . . . . . . . . . . . 126
8.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
9 Market efficiency 129
9.1 Informational efficiency: definition . . . . . . . . . . . . . . . . . . . . 129
9.1.1 Fleshing out the definitions . . . . . . . . . . . . . . . . . . . 130
9.1.2 Information versus operational efficiency . . . . . . . . . . . . 131
9.2 Information sets and varieties of efficiency . . . . . . . . . . . . . . . 131
9.3 Risk-adjustment and testing efficiency . . . . . . . . . . . . . . . . . . 132
9.4 Efficiency and random returns . . . . . . . . . . . . . . . . . . . . . . 133
9.5 Efficiency and statistics . . . . . . . . . . . . . . . . . . . . . . . . . . 134
9.5.1 Abnormal returns and fair games . . . . . . . . . . . . . . . . 135
9.6 Testing WFE: return predictability . . . . . . . . . . . . . . . . . . . 135
9.6.1 Autocorrelation analysis . . . . . . . . . . . . . . . . . . . . . 136
9.6.2 Autocorrelation evidence . . . . . . . . . . . . . . . . . . . . . 136
9.6.3 Calendar anomalies . . . . . . . . . . . . . . . . . . . . . . . . 137
9.6.4 Explaining calendar anomalies ... . . . . . . . . . . . . . . . . 137
9.6.5 Technical trading rules . . . . . . . . . . . . . . . . . . . . . . 138
9.6.6 Empirical evidence on technical trading rules . . . . . . . . . . 138
9.7 Tests of SSFE: event studies . . . . . . . . . . . . . . . . . . . . . . . 139
9.7.1 Interpreting an event study plot . . . . . . . . . . . . . . . . . 140
9.7.2 Results from event studies . . . . . . . . . . . . . . . . . . . . 142
9.8 Tests of strong-form efficiency . . . . . . . . . . . . . . . . . . . . . . 142
9.9 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
10 Derivatives: instruments and pricing 145
10.1 Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
10.1.1 Vanilla derivatives . . . . . . . . . . . . . . . . . . . . . . . . 146
10.1.2 Uses for Derivatives . . . . . . . . . . . . . . . . . . . . . . . . 147
10.2 Forwards and futures contracts . . . . . . . . . . . . . . . . . . . . . 147
10.2.1 Example: forward contract . . . . . . . . . . . . . . . . . . . . 149
10.3 Futures contracts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
10.3.1 Marking to market: example . . . . . . . . . . . . . . . . . . . 151
10.4 Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
10.4.1 Option payoffs . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
10.4.2 Long Call Payoff . . . . . . . . . . . . . . . . . . . . . . . . . 155
10.4.3 Long put payoff . . . . . . . . . . . . . . . . . . . . . . . . . . 156
10.4.4 Options, leverage and risk . . . . . . . . . . . . . . . . . . . . 158
10.5 Option combinations: payoffs and uses . . . . . . . . . . . . . . . . . 159
10.5.1 Bull and bear spreads . . . . . . . . . . . . . . . . . . . . . . 160
10.5.2 Bear spreads . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
10.5.3 The Butterfly spread . . . . . . . . . . . . . . . . . . . . . . . 162
10.5.4 Example; Payoff derivation for call Butterfly Spread . . . . . . 164
10.5.5 The Straddle . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
10.5.6 Strips and Straps . . . . . . . . . . . . . . . . . . . . . . . . . 166
10.6 Swaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
10.6.1 Vanilla Interest rate swap . . . . . . . . . . . . . . . . . . . . 168
10.6.2 What use is a swap? . . . . . . . . . . . . . . . . . . . . . . . 169
10.6.3 Currency swaps . . . . . . . . . . . . . . . . . . . . . . . . . . 170
10.6.4 Intermediaries . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
10.7 Pricing derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
10.7.1 Pricing by absence of arbitrage . . . . . . . . . . . . . . . . . 172
10.7.2 Implications of absence of arbitrage . . . . . . . . . . . . . . . 173
10.8 Forwards and futures prices . . . . . . . . . . . . . . . . . . . . . . . 173
10.8.1 Example; forward pricing . . . . . . . . . . . . . . . . . . . . . 174
10.8.2 Forward prices for dividend paying assets . . . . . . . . . . . . 175
10.9 Binomial option pricing . . . . . . . . . . . . . . . . . . . . . . . . . . 175
10.9.1 Set-up of the replicating portfolio . . . . . . . . . . . . . . . . 177
10.9.2 Replication versus risk-neutral pricing . . . . . . . . . . . . . 179
10.9.3 Example; binomial call pricing . . . . . . . . . . . . . . . . . . 180
10.10 Black-Scholes option pricing . . . . . . . . . . . . . . . . . . . . . . . 181
10.10.1 Continuous compounding and discounting . . . . . . . . . . . 182
10.10.2 The Black-Scholes formula . . . . . . . . . . . . . . . . . . . . 182
10.10.3 Measuring volatility . . . . . . . . . . . . . . . . . . . . . . . . 183
10.10.4 Example; BS call pricing . . . . . . . . . . . . . . . . . . . . . 184
10.11 Arbitrage and option price relationships . . . . . . . . . . . . . . . . 185
10.11.1 Put-call parity . . . . . . . . . . . . . . . . . . . . . . . . . . 185
10.11.2 Bounding a call price with the underlying . . . . . . . . . . . 186
10.12 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
CHAPTER 1
Introduction
These notes provide the basis for an advanced undergraduate or foundation level MSc
course on asset pricing. They are designed for use alongside an appropriate text as
many details and pieces of background information have been omitted.
The presentation begins with a review of some useful concepts from statistics and
mathematics. Then, the treatment passes to the valuation of fixed income securities.
Equity markets and equity pricing models are examined next, before a chapter on
defining and testing for securities market efficiency. Finally, a treatment of the basics
of derivatives pricing wraps up the presentation.
CHAPTER 2
Review of Statistics
In this chapter we briefly re-introduce those concepts from probability theory, statistics
and mathematics which will be crucial to the analysis that follows. More advanced
concepts will be introduced as and when required.
2.1 Basic probability
We'll start off by considering an experiment which has a set of potential outcomes, S,
which we'll call the sample space. Associated with each element of S is a probability:
the likelihood that, upon conducting the experiment, the outcome will be the element
in question. We require the following to be true of the probability function, P;
1. Probabilities are weakly positive; $P(A) \ge 0 \;\; \forall A$, where $A$ is a subset of the
sample space.
2. The probability of an outcome being a member of the sample space is unity;
$P(S) = 1$.
3. For non-overlapping subsets of the sample space, the probability of their union
is the sum of their individual probabilities;
$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$.
2.1.1 Example: rolling dice
For example, we can consider our experiment to be the rolling of a six-sided, unbiased
die.
Then the sample space consists of the set of possible numbers that can be rolled and
the three preceding conditions are obviously fulfilled.
The probability of rolling any particular number between 1 and 6 is one sixth. The
probability of rolling a number between one and six (inclusive) is unity.
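The die example can be checked mechanically. The following is a minimal sketch, assuming a fair die and a helper named `prob` (a name introduced here purely for illustration), that verifies the three conditions using exact fractions:

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of the sample space) for a fair die."""
    return Fraction(len(set(event) & sample_space), len(sample_space))

# Condition 1: probabilities are non-negative.
assert all(prob({s}) >= 0 for s in sample_space)
# Condition 2: the sample space has probability one.
assert prob(sample_space) == 1
# Condition 3: additivity for non-overlapping events, e.g. {1, 2} and {5}.
assert prob({1, 2} | {5}) == prob({1, 2}) + prob({5})

print(prob({3}))  # 1/6
```

Exact fractions are used rather than floating point so that the equalities hold without rounding error.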
2.1.2 Further conditions on probabilities
From the preceding conditions we can derive further rules that the probability
function must satisfy;
1. The probability of the empty set of outcomes is zero.
2. Any event or subset of the sample space must have probability weakly less than
one; $P(A) \le 1 \;\; \forall A \subseteq S$.
3. Take any event or set in S, call it A. Then the probability of the complement
of A is just one minus the probability of A; $P(A^c) = 1 - P(A)$.
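As an illustration of how such rules follow from the three original conditions, here is a short derivation of the complement rule. It uses only the facts that $A$ and $A^c$ do not overlap and that together they make up $S$:

```latex
\[
1 = P(S) = P(A \cup A^c) = P(A) + P(A^c)
\quad\Longrightarrow\quad
P(A^c) = 1 - P(A).
\]
```

Since $P(A^c) \ge 0$ by the first condition, the bound $P(A) \le 1$ follows immediately from the same line.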
So we now have an experiment with a variety of outcomes and we understand certain
restrictions that the probability function must satisfy. From our current set of rules
that probabilities must obey we could go on to derive many more, but we'll stop here
for the time being.
2.1.3 Random variables
A random variable is just a mapping from our sample space onto the real line. It
takes the experiment that has been run and assigns, for each outcome, a numerical
value.
For example, when tossing a coin, we might define a new variable X to be −1 in the
event of a head being tossed and +1 for the case of tails.
We can derive the probability law for our random variable by taking the original
probability function for the sample space, S, and determining the implied probabilities
for the values of X. In our coin tossing example this is particularly easy as each
event in the sample space is associated with a unique value of the random variable.
It's a bit trickier if the mapping between the sample space and random variable
is not one-to-one: the probability of a given value of X is then the sum of the
probabilities of all outcomes that map to that value.
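To make the non one-to-one case concrete, the sketch below uses a hypothetical random variable on the die experiment (it is not an example from these notes): X equals 1 if the roll is a multiple of three and 0 otherwise. The probability of each value of X is obtained by adding up the probabilities of the outcomes mapped to it.

```python
from collections import defaultdict
from fractions import Fraction

# Probability function on the sample space: a fair six-sided die.
p_outcome = {s: Fraction(1, 6) for s in range(1, 7)}

def X(s):
    """Hypothetical random variable: 1 if the roll is a multiple of three, else 0."""
    return 1 if s % 3 == 0 else 0

# P(X = x) is the total probability of all outcomes that map to x.
p_X = defaultdict(Fraction)
for s, p in p_outcome.items():
    p_X[X(s)] += p

print(dict(p_X))
```

Two outcomes (3 and 6) map to X = 1 and four outcomes map to X = 0, so the implied probabilities are one third and two thirds respectively.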
2.1.4 CDFs and PDFs
We now have a random variable, X, that has values situated on the real line. The
cumulative distribution function of X is defined as follows;
Cumulative distribution function: the CDF of the random variable X
is;
\[ F(x) = P(X \le x) \quad \forall x \tag{2.1} \]
The CDF tells us, for a given value of the random variable, the total probability of
observing the random variable at that value or lower.
F(x) is bounded below at zero and above at unity such that a plot of the CDF must
increase from zero to one as one moves rightwards along the real line.
The CDF can clearly never decrease as one moves rightwards along the real line.
The CDF also tells us whether a random variable (RV) is continuous or discrete. A
discrete random variable possesses a CDF which is a step function while a continuous
RV has a smoothly increasing CDF.
2.1.5 Discrete random variables: the probability mass function
For a discrete RV, we define the probability mass function as follows;
Probability mass function: for a discrete RV, X, the probability mass
function is defined by;
\[ f(x) = P(X = x) \quad \forall x \tag{2.2} \]
The mass function in the discrete case indicates the likelihood of a specified event.
Clearly, we can sum up this function to yield the CDF i.e.
\[ F(k) = P(X \le k) = \sum_{x \le k} f(x) \]
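The relationship between the mass function and the CDF can be sketched in a few lines. This example again assumes a fair six-sided die; the names `values`, `pmf` and `F` are illustrative choices, not notation from the notes.

```python
from fractions import Fraction
from itertools import accumulate

values = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6

# Running total of the mass function gives the CDF at each possible value.
cdf_at_values = list(accumulate(pmf))

def F(k):
    """CDF at k: the sum of f(x) over all x <= k."""
    return sum((p for x, p in zip(values, pmf) if x <= k), Fraction(0))

print(F(4))    # 2/3
print(F(3.5))  # 1/2
```

The value at 3.5 equals the value at 3, which is the step-function behaviour of a discrete CDF described above.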
2.1.6 Continuous random variables: the probability density function
In the continuous RV case, we define the probability density function as follows;
Probability density function: for a continuous RV, X, the probability
density function, $f(\cdot)$, is defined by;
\[ F(x) = \int_{-\infty}^{x} f(z)\,dz \quad \forall x \tag{2.3} \]
Thus, in the continuous case we're substituting integrals for sums but doing the
same kind of thing. Intuitively, the PDF tells us about the probability that X lies in a
small region around a specific value x: that probability is approximately f(x) times
the width of the region. The PDF is the function that one aggregates
in order to arrive at the CDF.
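Equation (2.3) can be illustrated numerically. The sketch below assumes, purely for illustration, an exponential density with rate one (chosen because its CDF, $1 - e^{-x}$, is known in closed form) and approximates the integral with the trapezoid rule. The function names are hypothetical.

```python
import math

def pdf(z, rate=1.0):
    """Exponential density, used purely as an illustrative f(z)."""
    return rate * math.exp(-rate * z) if z >= 0 else 0.0

def cdf_numeric(x, n=10_000):
    """Approximate F(x), the integral of f(z) dz up to x, by the trapezoid rule.

    The density is zero below 0, so the integral can start there.
    """
    if x <= 0:
        return 0.0
    h = x / n
    total = 0.5 * (pdf(0.0) + pdf(x))
    total += sum(pdf(i * h) for i in range(1, n))
    return total * h

x = 1.5
print(cdf_numeric(x), 1 - math.exp(-x))
```

The two printed numbers should agree to several decimal places, which is the sense in which the PDF is "aggregated" into the CDF.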
2.1.7 Features of PMFs and PDFs
Finally, some features of PDFs and PMFs are as follows;
1. PDFs and PMFs must both be non-negative i.e. in either case we have
$f(x) \ge 0 \;\; \forall x$.
2. Integrating the PDF over the entire real line or summing a PMF over the
set of all possible events must yield unity; in the continuous case we have
$\int_{-\infty}^{+\infty} f(x)\,dx = 1$ and in the discrete case $\sum_{x} f(x) = 1$.
In what follows we're going to concentrate on the continuous RV case as this covers
the vast majority of applications that we'll come across in finance theory.
2.2 Basic Statistics
The previous section introduced density and distribution functions of a continuous
RV which give complete descriptions of the probability law that governs the behaviour
of that RV. However, in many cases, it is enough to know certain summary properties
of the probability distribution rather than the entire distribution.
We start with the definition of the population mean or expectation;
\[ \mu = E(X) = \int_{-\infty}^{+\infty} x\, f(x)\,dx \tag{2.4} \]
The mean is the probability-weighted sum of values taken by the RV. It tells us
what the average value of the RV under examination is, having accounted for the
probabilities of each outcome. Note that the mean may be infinite.
2.2.1 Expectations and functions of RVs
We can perform a similar calculation to the above for an arbitrary function of our
RV. The expectation of g(X) is given by;
\[ E(g(X)) = \int_{-\infty}^{+\infty} g(x)\, f(x)\,dx \tag{2.5} \]
From the definition of the mean and equation (2.5) it should be obvious that the
following rules are true for means. Given two constants, a and b;
\[ E(a) = a \]
\[ E(bX) = b\,E(X) \]
\[ E(a + bX) = a + b\,E(X) \]
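For readers who want the intermediate step, the last rule follows from equation (2.5) with $g(x) = a + bx$, together with the fact that a density integrates to one:

```latex
\[
E(a + bX)
= \int_{-\infty}^{+\infty} (a + b x)\, f(x)\,dx
= a \int_{-\infty}^{+\infty} f(x)\,dx + b \int_{-\infty}^{+\infty} x\, f(x)\,dx
= a + b\,E(X).
\]
```

Setting $b = 0$ gives the first rule and setting $a = 0$ gives the second.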
2.2.2 The Variance and Standard Deviation
We can now define the variance of a continuous RV;
\[ \sigma^2 = \mathrm{Var}(X) = E\left[(X - \mu)^2\right] = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx \tag{2.6} \]
The variance tells us how widely the values of the RV are spread around its
mean (in a probability weighted fashion). If the variance is large then values of the
RV tend to be widely dispersed around the mean while the converse is true if the
variance is small. Note that the variance can never be negative (it is zero only if the
RV is constant with probability one).
The standard deviation of an RV, $\sigma$, is simply defined as the square root of its variance.
2.2.3 Manipulating Variances
Again, given two constants a and b, the following rules are true;
\[ \mathrm{Var}(a) = 0 \]
\[ \mathrm{Var}(bX) = b^2\,\mathrm{Var}(X) \]
\[ \mathrm{Var}(a + bX) = b^2\,\mathrm{Var}(X) \]
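These rules can be checked on a small example. The sketch below assumes a discrete distribution (a fair die) so that the arithmetic is exact, with sums taking the place of the integrals in equation (2.6). The helper names are illustrative only.

```python
from fractions import Fraction

# Illustrative discrete distribution: value -> probability (a fair die).
dist = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(d):
    """Probability-weighted sum of the values."""
    return sum(x * p for x, p in d.items())

def var(d):
    """Probability-weighted squared deviation from the mean."""
    m = mean(d)
    return sum((x - m) ** 2 * p for x, p in d.items())

def transform(d, a, b):
    """Distribution of a + b*X (b != 0 keeps the mapping one-to-one)."""
    return {a + b * x: p for x, p in d.items()}

a, b = 10, 3
print(var(dist))                   # 35/12
print(var(transform(dist, a, b)))  # 105/4
```

Adding the constant shifts every value and the mean by the same amount, so only the factor $b^2 = 9$ shows up in the variance.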
2.2.4 Skewness and Kurtosis
The final two descriptive statistics for distributions that we'll introduce are both
based on central moments of the distribution. The rth central moment of a distribution
is calculated as;
\[ \mu_r = E\left[(X - \mu)^r\right] = \int_{-\infty}^{+\infty} (x - \mu)^r f(x)\,dx \tag{2.7} \]
where $\mu$ is the mean. Note that the second central moment of the distribution is the
population variance.
The coefficient of skewness of a continuous RV is the third central moment scaled by
the cube of the standard deviation i.e.
\[ \alpha_3 = \frac{E\left[(X - \mu)^3\right]}{\sigma^3} = \frac{1}{\sigma^3} \int_{-\infty}^{+\infty} (x - \mu)^3 f(x)\,dx \tag{2.8} \]
The coefficient of skew is zero for symmetric distributions. Figure 2.1 shows 3
distributions, one with positive skew and one with negative skew. The final
distribution is symmetric and thus has zero skew.
[Figure 2.1: Skewed distributions. Vertical axis: probability. Curves shown: positive skew, zero skew, negative skew.]
Last of all, the coefficient of kurtosis is defined to be the 4th central moment scaled
by the square of the variance. Thus;
\[ \alpha_4 = \frac{E\left[(X - \mu)^4\right]}{\sigma^4} = \frac{1}{\sigma^4} \int_{-\infty}^{+\infty} (x - \mu)^4 f(x)\,dx \tag{2.9} \]
Kurtosis is often used to judge whether a distribution has fat or thin tails. Fat (thin)
tailed distributions have large (small) values for the coefficient of kurtosis. See Figure
2.2.
[Figure 2.2: Kurtosis. Horizontal axis: −5 to 5. Vertical axis: probability, 0 to 0.4. Curves shown: thin-tailed, fat-tailed.]
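The coefficients of skewness and kurtosis are straightforward to compute once the central moments are available. The following sketch uses three small made-up discrete distributions (sums replace the integrals of the continuous case), and the function names are assumptions introduced for this example.

```python
def central_moment(dist, r):
    """r-th central moment of a discrete distribution {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    return sum((x - mu) ** r * p for x, p in dist.items())

def skewness(dist):
    """Third central moment scaled by the cube of the standard deviation."""
    return central_moment(dist, 3) / central_moment(dist, 2) ** 1.5

def kurtosis(dist):
    """Fourth central moment scaled by the square of the variance."""
    return central_moment(dist, 4) / central_moment(dist, 2) ** 2

# Made-up distributions for illustration.
symmetric = {-1: 0.25, 0: 0.5, 1: 0.25}
right_skewed = {0: 0.7, 1: 0.2, 5: 0.1}
fat_tailed = {-3: 0.05, 0: 0.9, 3: 0.05}

print(skewness(symmetric), skewness(right_skewed))
print(kurtosis(symmetric), kurtosis(fat_tailed))
```

The symmetric distribution has zero skew, the distribution with a long right tail has positive skew, and the distribution that places a little probability on extreme values has the larger kurtosis.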
2.3 Multivariate probability and statistics
In the previous section we discussed ways to summarize the properties of a single
random variable. Perhaps more commonly in quantitative finance, we wish to
characterize the relationship between 2 (or more) random variables.
The basic statistical object underlying the relationship between a pair (or larger set)
of RVs is their joint probability density which we'll denote by f(x, y). We have;
\[ f(x, y) \ge 0 \;\; \forall x, y \quad \text{and} \quad \int_{y} \int_{x} f(x, y)\,dx\,dy = 1 \tag{2.10} \]
The joint density, evaluated at values $x_0$ and $y_0$, can be intuitively thought of as
giving information about the probability that X lies in a small interval around $x_0$
and Y lies in a small region around $y_0$.
2.3.1 The joint distribution and marginal distributions
Integrating the joint density gives the joint probability distribution, F(x, y), which
tells us about the total probability of observing a value of X less than or equal to x
and Y less than or equal to y;
\[ F(x, y) = \Pr(X \le x,\; Y \le y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f(s, z)\,ds\,dz \tag{2.11} \]
Also, from the joint density we can derive the univariate density of X alone. This is
called the marginal density of X and is defined by;
\[ f_X(x) = \int_{-\infty}^{+\infty} f(x, y)\,dy \tag{2.12} \]
2.3.2 The conditional density
Assume that we're given the joint density of two variables, f(x, y) and we're also told
that the value of Y is $y_0$. The conditional density of X tells us how knowing that Y
equals $y_0$ affects the likely outcome for X. It is given by;
\[ f(x \mid Y = y_0) = \frac{f(x, y_0)}{f_Y(y_0)} \tag{2.13} \]
As $y_0$ is known and $f_Y(y_0)$ is a (positive) constant, $f(x \mid Y = y_0)$ is non-negative
and integrates to one so it's a proper density. We can derive conditional expectations as follows;
\[ E(X \mid Y = y_0) = \int_{-\infty}^{+\infty} x\, f(x \mid Y = y_0)\,dx \tag{2.14} \]
The conditional expectation tells us about the value for X one should expect to see
on average if one has already observed a value for Y of $y_0$. Hence we could calculate,
for example, the expected movement in the FTSE-100 on a day upon which we know
that the Nikkei-225 has fallen by 100 points.
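The steps from joint density to conditional expectation can be sketched with a discrete analogue, where the integrals in equations (2.12) to (2.14) become sums. The joint probabilities below are made up for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(X = x, Y = y); the numbers are made up.
joint = {
    (-1, -1): Fraction(3, 10), (-1, 1): Fraction(1, 10),
    (1, -1): Fraction(2, 10),  (1, 1): Fraction(4, 10),
}

def marginal_y(y0):
    """Marginal probability of Y = y0: sum the joint over all x."""
    return sum(p for (x, y), p in joint.items() if y == y0)

def conditional_x(y0):
    """Conditional distribution of X given Y = y0: joint divided by marginal."""
    fy = marginal_y(y0)
    return {x: p / fy for (x, y), p in joint.items() if y == y0}

def conditional_mean_x(y0):
    """Expected value of X given Y = y0."""
    return sum(x * p for x, p in conditional_x(y0).items())

print(conditional_x(-1))
print(conditional_mean_x(-1))  # -1/5
print(conditional_mean_x(1))   # 3/5
```

In this made-up example a low value of Y makes a low value of X more likely, so the conditional mean of X moves with the observed value of Y.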
2.3.3 Moments and functions of two random variables
Deriving moments for functions of two (or more) variables is straightforward;
\[ E[G(X, Y)] = \int_{y} \int_{x} G(x, y)\, f(x, y)\,dx\,dy \]
where G(x, y) is a function of random variables X and Y and f(x, y) is their joint
density. The most important example we'll come across is the covariance;
\[ \mathrm{Cov}(X, Y) = E\left[(X - \mu_X)(Y - \mu_Y)\right] = \int_{y} \int_{x} (x - \mu_X)(y - \mu_Y)\, f(x, y)\,dx\,dy \tag{2.15} \]
where $\mu_X = E(X)$ and $\mu_Y = E(Y)$.
2.3.4 Properties of Covariance
The covariance measures the degree of association between two variables and displays
the following properties;
• If above average values of X and Y tend to be observed together and so do
below average values of X and Y then the covariance will be positive (X and
Y are positively associated).
• If above average values of X tend to be observed at the same time as below
average values of Y and vice versa, then the covariance will be negative (the
variables are negatively associated).
• If there's no clear linear relationship between X and Y then the covariance will be
close to zero.
• From equation (2.15), $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$.
• Again from equation (2.15), $\mathrm{Cov}(aX, bY) = a\,b\,\mathrm{Cov}(X, Y)$.
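The scaling property can be checked on a small example. The sketch below assumes a made-up discrete joint distribution, so the double integral in equation (2.15) becomes a double sum, and the helper names are illustrative.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(X = x, Y = y); the numbers are made up.
joint = {
    (-1, -1): Fraction(3, 10), (-1, 1): Fraction(1, 10),
    (1, -1): Fraction(2, 10),  (1, 1): Fraction(4, 10),
}

def expect(g, dist):
    """E[g(X, Y)] under a discrete joint distribution."""
    return sum(g(x, y) * p for (x, y), p in dist.items())

def cov(dist):
    """Covariance: expected product of deviations from the two means."""
    mx = expect(lambda x, y: x, dist)
    my = expect(lambda x, y: y, dist)
    return expect(lambda x, y: (x - mx) * (y - my), dist)

def scale(dist, a, b):
    """Joint distribution of (aX, bY), for non-zero a and b."""
    return {(a * x, b * y): p for (x, y), p in dist.items()}

print(cov(joint))                # 2/5
print(cov(scale(joint, 2, -3)))  # -12/5
```

Here high values of X and Y tend to occur together, so the covariance is positive. Multiplying Y by a negative constant flips its sign, as the scaling rule predicts.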
2.3.5 Correlation
A weakness of the covariance is that its level depends on the units of X and Y.
Changing the units will change the value of the covariance. Hence, it's common to
scale the covariance by the product of the standard deviations of X and Y. This
gives a scale independent measure of association, the correlation coefficient, which
lies between negative and positive unity.
Formally the correlation coefficient ($\rho(X, Y)$) is defined as follows;
\[ \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X\, \sigma_Y} \tag{2.16} \]
Positive correlations are interpreted in the same way as positive covariances and negative
correlations in the same way as negative covariances. The strength of association
between two variables can be gauged by the magnitude of the correlation coefficient.
2.3.6 Independence and un-correlatedness
Finally, it is useful to characterise how we might state that a pair of variables are
not related. If we find two variables which have a correlation coefficient of zero then
these variables are said to be uncorrelated. This means that the fact that X might
be above (or below) its mean tells us nothing about the likely behaviour of Y relative
to its mean.
A stronger notion of the absence of a relationship between random variables is
independence. The basic definition of independence is that the joint density of X and Y
is just the product of the marginal densities i.e.
\[ f(x, y) = f_X(x)\, f_Y(y) \]
This immediately implies that the expectation of the product of X and Y is the
product of the expectations of X and Y i.e. E(X Y ) = E(X) E(Y ).
This further implies that;
\[ \mathrm{Cov}(X, Y) = \rho(X, Y) = 0 \]
Hence, independence implies uncorrelatedness (but the converse is not true).
Finally, if X and Y are independent then the conditional density of X given Y is
just the marginal density of X. Knowing the level of Y gives us no extra information
regarding possible outcomes for X. Hence:
\[ f(x \mid Y = y_0) = f_X(x) \]
As weve said, independence implies uncorrelatedness, but the converse is not true.
Independence implies that all arbitrary functions of X are uncorrelated with any
arbitrary function of Y clearly a much stronger idea.
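A small worked case may help to separate the two ideas. The sketch below uses a made-up discrete example (not taken from these notes): X takes the values -1, 0 and 1 with equal probability and Y = X². It relies on the standard identity Cov(X, Y) = E(XY) - E(X)E(Y). The covariance comes out as exactly zero, yet the joint probability of (X = 0, Y = 0) differs from the product of the marginal probabilities, so the two variables are uncorrelated without being independent.

```python
from fractions import Fraction

# Hypothetical example: X is -1, 0 or 1 with probability 1/3 each, and Y = X**2.
p = Fraction(1, 3)
outcomes = [(x, x * x) for x in (-1, 0, 1)]  # (x, y) pairs, each with probability p

e_x = sum(p * x for x, _ in outcomes)
e_y = sum(p * y for _, y in outcomes)
e_xy = sum(p * x * y for x, y in outcomes)
cov_xy = e_xy - e_x * e_y  # Cov(X, Y) = E(XY) - E(X)E(Y)

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint = sum(p for x, y in outcomes if x == 0 and y == 0)
p_x0 = sum(p for x, _ in outcomes if x == 0)
p_y0 = sum(p for _, y in outcomes if y == 0)

print("Cov(X, Y) =", cov_xy)
print("P(X=0, Y=0) =", p_joint, "but P(X=0) P(Y=0) =", p_x0 * p_y0)
```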
2.3.7 Variances of combinations of RVs
Again consider our 2 variables, X and Y. The variance of a linear combination of
these variables is determined as follows:
\[
\mathrm{Var}(aX + bY) = a^2 \, \mathrm{Var}(X) + 2ab \, \mathrm{Cov}(X, Y) + b^2 \, \mathrm{Var}(Y)
\]
So the variance of a combination of random variables is not just a combination of the
individual variances, but it also includes a covariance term (as one might expect).
Obviously, if X and Y are uncorrelated (or indeed independent), then this covariance
term is zero and the variance of the combination is just a combination of the variances.
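As a quick numerical check of this formula, the sketch below evaluates both sides on a small made-up data set, treating each observation as equally likely. The data, the weights a and b, and the helper names are all illustrative assumptions; exact fractions are used so that the two sides can be compared without rounding error.

```python
from fractions import Fraction

def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Covariance of two equally weighted lists; cov(u, u) is the variance."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

# Made-up data and weights, held as exact fractions.
x = [Fraction(n) for n in (1, 2, 4, 7)]
y = [Fraction(n) for n in (3, 1, 5, 2)]
a, b = Fraction(2), Fraction(-3)

combo = [a * xi + b * yi for xi, yi in zip(x, y)]
lhs = cov(combo, combo)  # Var(aX + bY) computed directly
rhs = a**2 * cov(x, x) + 2 * a * b * cov(x, y) + b**2 * cov(y, y)
print(lhs, rhs)
```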
2.4 Sample moments
We will now demonstrate how to calculate values for the univariate and multivariate
statistics we've described above from a sample of real data. For this purpose, presume
that we have observed the values of 2 variables at various points in time.
For example, the two variables might be daily observations on the market-closing
price of BT and Vodafone stock. Thus our sampling frequency in this example is
daily, and let's assume that we have observed data on a total of T consecutive trading
days.
We will denote the price of BT stock on day t with $X_t$ and Vodafone stock with $Y_t$,
where t = 1, 2, ..., T.
2.4.1 The Sample Mean
The sample mean, denoted $\bar{X}$, is given by:
\[
\bar{X} = \frac{1}{T} \sum_{i=1}^{T} X_i \tag{2.17}
\]
i.e. the sample mean is just the arithmetic average of the data. You should be able
to see the similarity of this equation with the definition of the population mean in
equation (2.4).
Note that points in the distribution of our RV with higher probability are more likely
to turn up in any sample of data. Thus, when we average all of our data points, we're
going to implicitly get some probability weighting, just like in equation (2.4).
2.4.2 The distribution of the sample mean
Assume that the data are independent draws from a distribution with expected value
$\mu$ and variance $\sigma^2$. As such, for each i we have $E(X_i) = \mu$ and
$\mathrm{Var}(X_i) = \sigma^2$. Hence the expected value and variance of the sample mean
are as follows:
\[
E(\bar{X}) = \frac{1}{T} \sum_{i=1}^{T} \mu = \mu \tag{2.18}
\]
\[
\mathrm{Var}(\bar{X}) = \mathrm{Var}\left( \frac{1}{T} \sum_{i=1}^{T} X_i \right)
= \frac{1}{T^2} \sum_{i=1}^{T} \sigma^2 = \frac{\sigma^2}{T} \tag{2.19}
\]
As T increases, the sample mean becomes a more precise estimate of the population
mean. These results are valid whatever the distribution of the $X_i$.
2.4.3 The Central Limit Theorem
The CLT tells us that if we have a fairly large number of data points, the
distribution of the sample mean tends to be Normal, i.e. as T gets large the following is
approximately true for the sample mean:
\[
\bar{X} \sim N\left( \mu, \frac{\sigma^2}{T} \right)
\]
Again, this result is true regardless of the true distribution of the data. They could be
Exponentially distributed (see a couple of pages down the line for a definition), for
example, but if our data sample consists of more than 50 observations (or so) then we
can treat the sample mean as approximately Normally distributed.
Of course, the approximation will be somewhat better if the original data are close
to Normal and somewhat worse if the original data are far from Normal.
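The results for the mean and variance of the sample mean can be checked by simulation. The sketch below is an illustration only: the rate parameter, sample size, number of replications and random seed are arbitrary choices, not values from these notes. It repeatedly draws samples of T = 50 Exponential observations and compares the average and variance of the resulting sample means with the theoretical values of $\mu$ and $\sigma^2/T$.

```python
import random
import statistics

random.seed(0)      # arbitrary seed so the run is repeatable
lam = 2.0           # Exponential rate: population mean 1/lam, variance 1/lam**2
T = 50              # observations per sample
replications = 4000

sample_means = []
for _ in range(replications):
    sample = [random.expovariate(lam) for _ in range(T)]
    sample_means.append(statistics.mean(sample))

mean_of_means = statistics.mean(sample_means)
var_of_means = statistics.variance(sample_means)

print("average of sample means :", mean_of_means, " theory:", 1 / lam)
print("variance of sample means:", var_of_means, " theory:", (1 / lam**2) / T)
```

A histogram of `sample_means` would look close to a bell curve even though the underlying Exponential data are strongly skewed, which is the content of the CLT.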
2.4.4 Other sample moments
The sample variance of a set of data is constructed as follows:
\[
s^2 = \frac{1}{T-1} \sum_{i=1}^{T} \left( X_i - \bar{X} \right)^2 \tag{2.20}
\]
The sample standard deviation (s) is just the square root of the sample variance.
The sample skew and sample kurtosis of a given set of data are defined in analogous
manner to the sample variance. The third and fourth sample moments of a set of
data are given by:
\[
m_3 = \frac{1}{T-1} \sum_{i=1}^{T} \left( X_i - \bar{X} \right)^3 , \qquad
m_4 = \frac{1}{T-1} \sum_{i=1}^{T} \left( X_i - \bar{X} \right)^4
\]
Then the sample versions of the coefficients of skewness and kurtosis are calculated
as $m_3 / s^3$ and $m_4 / s^4$ respectively.
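Finally, moving to multivariate statistics, the sample covariance and correlation
coefficient are calculated as shown below:
\[
\widehat{\mathrm{Cov}}(X, Y) = \frac{1}{T-1} \sum_{i=1}^{T} \left( X_i - \bar{X} \right) \left( Y_i - \bar{Y} \right) , \qquad
\hat{\rho}(X, Y) = \frac{\widehat{\mathrm{Cov}}(X, Y)}{s_X \, s_Y}
\]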
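The sample formulas above translate directly into code. The sketch below is one possible implementation, using the T - 1 divisor throughout as in these notes; the function name `sample_stats` and the toy data are illustrative assumptions.

```python
import math

def sample_stats(xs, ys):
    """Sample moments of xs, plus covariance and correlation with ys (T - 1 divisor)."""
    T = len(xs)
    x_bar, y_bar = sum(xs) / T, sum(ys) / T

    def central(k):
        # k-th sample central moment of xs
        return sum((x - x_bar) ** k for x in xs) / (T - 1)

    s_x = math.sqrt(central(2))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (T - 1))
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (T - 1)
    return {
        "mean": x_bar,
        "variance": central(2),
        "skewness": central(3) / s_x**3,
        "kurtosis": central(4) / s_x**4,
        "covariance": cov,
        "correlation": cov / (s_x * s_y),
    }

# Toy data: y is an exact multiple of x, so the correlation should be 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 * v for v in x]
stats = sample_stats(x, y)
for name, value in stats.items():
    print(f"{name:12s} {value:.4f}")
```

Because the toy data are symmetric about their mean, the sample skewness is zero, and rescaling y leaves the correlation unchanged even though the covariance changes.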
2.5 Some univariate probability distributions
If someone has come across only one continuous distribution before then it's very
likely that the distribution in question is the Normal (sometimes called the
Gaussian) distribution. If a variable Y is Normally distributed then its density is
as follows:
\[
f(y) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu}{\sigma} \right)^2 \right] ,
\quad \text{where } -\infty < \mu < \infty , \ \sigma > 0 \tag{2.21}
\]
Note that two parameters, $\mu$ and $\sigma$, completely define the Normal and, hence, we'll be
using the following notation as shorthand for a Normally distributed random variable:
\[
Y \sim N(\mu, \sigma^2).
\]
With regard to these two parameters, it turns out that the following is true:
\[
E(Y) = \mu , \qquad \mathrm{Var}(Y) = \sigma^2
\]
Hence the two parameters of the Normal specify the mean and standard deviation
respectively. We can go on to show the following for the Normal:
\[
\text{coefficient of skewness} = 0 \tag{2.22}
\]
\[
\text{coefficient of kurtosis} = 3 \tag{2.23}
\]
Both of these results hold regardless of the values taken by $\mu$ and $\sigma$. As the coefficient
of skewness is zero, the Normal is symmetric. The coefficient of kurtosis is three. This
value is used as a reference point by statisticians to judge whether a distribution has
fat or thin tails.
2.5.1 The $\chi^2$ distribution
In economics and finance, we often work under the assumption that the variables/data
we're interested in are Normally distributed. A consequence of this is that the $\chi^2$
distribution appears quite frequently.
A RV has a $\chi^2$ distribution with k degrees of freedom when it is the sum of the
squares of k independent, standard Normal random variables. Its PDF is:
\[
f(y) = \frac{1}{\Gamma(k/2) \, 2^{k/2}} \, y^{k/2 - 1} \exp(-y/2) , \quad y > 0 \tag{2.24}
\]
Again k, the single parameter of the distribution, is called the degrees of freedom and
will be a positive integer. $\Gamma(\cdot)$ is the Gamma function.
The mean of a $\chi^2_k$ is k and the variance is 2k. Also, for the $\chi^2$, the coefficient of
skewness exceeds zero, i.e. it is positively skewed.
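The definition of the $\chi^2_k$ as a sum of squared standard Normals suggests a simple way to check its mean and variance by simulation. The sketch below is illustrative only: the choice k = 4, the number of draws and the seed are arbitrary assumptions.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable
k = 4           # degrees of freedom (illustrative choice)
n_draws = 20000

# Each draw is the sum of k squared independent standard Normal variables.
draws = [
    sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))
    for _ in range(n_draws)
]

chi2_mean = statistics.mean(draws)
chi2_var = statistics.variance(draws)
print("simulated mean    :", chi2_mean, " theory:", k)
print("simulated variance:", chi2_var, " theory:", 2 * k)
```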
2.5.2 The Student-t distribution
The Student-t is also a relative of the Normal distribution. In recent times, the
Student-t has also become a popular choice when choosing a distribution to fit
financial return data. The main reason for this is that it is fat-tailed.
A variable is Student-t distributed with k degrees of freedom if it is the ratio of a
standard Normal variable to the square root of a $\chi^2_k$ variable divided by k. That is,
Y is $t_k$ distributed if:
\[
Y = \frac{Z}{\sqrt{V/k}} \tag{2.25}
\]
where Z is standard Normal and V is distributed $\chi^2_k$. Using this definition we get the
following PDF:
\[
f(y) = \frac{\Gamma((k+1)/2)}{\Gamma(k/2)} \, \frac{1}{\sqrt{k\pi}} \, \frac{1}{(1 + y^2/k)^{(k+1)/2}} \tag{2.26}
\]
where, again, $\Gamma(\cdot)$ is the Gamma function.
Note that, as k becomes large, the $t_k$ converges towards the standard Normal
distribution. The mean and variance of the $t_k$ are as given below:
\[
E(Y) = 0 \ \text{ for } k > 1 , \qquad \mathrm{Var}(Y) = \frac{k}{k-2} \ \text{ for } k > 2
\]
2.5.3 The Exponential Distribution
The final distribution that we'll introduce is (like the $\chi^2$) defined only for positive
values. It is the Exponential distribution. A random variable, X, is exponentially
distributed if it has the following PDF:
\[
f(x) = \lambda e^{-\lambda x} , \quad \text{for } x \geq 0 \text{ and } \lambda > 0 \tag{2.27}
\]
The Exponential distribution has a single parameter, $\lambda$, that determines its
characteristics. Indeed, it's easy to show that:
\[
E(X) = \frac{1}{\lambda} \quad \text{and} \quad \mathrm{Var}(X) = \frac{1}{\lambda^2}
\]
Hence, for an Exponential random variable, the mean is the square root of the
variance, i.e. the mean equals the standard deviation.
CHAPTER 3
Present values
Our initial analysis focusses on two areas:
• Review of compounding, discounting and the time-value of money.
• An introduction to Fixed Income Securities.
In the next chapter, we will build on this material by discussing the valuation of fixed
income securities and some issues from fixed income portfolio management.
3.1 Present values and the time-value of money
The first substantial topic for our analysis is the nature and valuation of fixed-income
securities, i.e. bonds. However, before we discuss bond valuation we will need to
review the mechanics of discounting and the computation of present and future values
of cashflow streams.
Basic situation: an investor has £X and decides to place the money in an interest-
paying account. Let's assume that the rate of interest on the account is r per period.
After k periods, the amount that the investor has in his account depends upon the
way in which interest is computed for that account. There are two common methods
of computing interest.
3.1.1 Simple interest
Proportional interest is paid every period, but only on the initial balance in the
account. Thus, with an initial deposit of X, every period the investor earns an
interest payment of rX. Hence, after k periods, the total balance in the account (V)
is equal to:
\[
V = (1 + r k) X
\]
3.1.2 Compound interest
Every period, interest is paid proportional to the current balance in the account (i.e.
the initial balance plus any accrued interest to date). Thus, at the end of that period
the total amount in the account is equal to (1 + r) times the opening balance.
After k periods, the value of the account is:
\[
V = X (1 + r)^k
\]
Obviously, compound interest is more familiar than simple interest. It is the way our
bank accounts operate, for example. In what follows we will concentrate largely on
the compound interest case.
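To see how far the two methods diverge, the short sketch below evaluates both balance formulas for the same deposit. The deposit of 100, the rate of 5% and the horizon of 10 periods are made-up inputs, and the function names are illustrative.

```python
def simple_balance(deposit, rate, periods):
    """Balance under simple interest: V = (1 + r k) X."""
    return deposit * (1 + rate * periods)

def compound_balance(deposit, rate, periods):
    """Balance under compound interest: V = X (1 + r)^k."""
    return deposit * (1 + rate) ** periods

# Illustrative inputs: deposit 100, 5% per period, 10 periods.
simple = simple_balance(100.0, 0.05, 10)
compound = compound_balance(100.0, 0.05, 10)
print(f"simple: {simple:.2f}  compound: {compound:.2f}")
```

The gap between the two balances is the interest earned on previously credited interest, and it widens as the number of periods grows.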
3.1.3 Compounding for sub-periods
In many situations, one is told a nominal interest rate for a period, but compounding
occurs several times within the period. For example, banks often publicize a nominal
annual interest rate but interest is paid monthly.
Sub-periods and interest rates: in a case where a nominal annual interest
rate is given as r, but compounding occurs n times per year, the
actual interest rate used in each compounding step is r/n.
Here the effective annual rate of interest (EAR) earned exceeds the nominal rate.
The nominal rate is often called the Annual Percentage Rate (APR).
Consider a case where the nominal rate is r, the initial deposit in the account is X
and compound interest is computed and paid n times during each period. In this
case, the balance in the account after one period is equal to:
\[
V = X \left( 1 + \frac{r}{n} \right)^n
\]
In the case where interest is only paid at the end of every period, the balance in the
account after one period would be:
\[
V = X (1 + r)
\]
It is straightforward to show that, for any positive interest rate and n > 1, the following
inequality holds:
\[
\left( 1 + \frac{r}{n} \right)^n > (1 + r)
\]
Thus the payment of interest within periods leads to a greater balance in the account
than if interest was only paid at the end of every period. The effective rate of interest,
denoted $r^{*}$, can be calculated as follows:
\[
1 + r^{*} = \left( 1 + \frac{r}{n} \right)^n
\]
As an example, define a period to be one year and consider a bank account paying
a nominal interest rate of 9%. Interest is computed and paid every month such that
n = 12. Here, the effective rate is given by:
\[
1 + r^{*} = \left( 1 + \frac{0.09}{12} \right)^{12} = 1.0938
\]
Hence, in this setting the investor effectively earns an annual interest rate of 9.38%.
We will call this effective rate the equivalent annual rate in order to distinguish it
from the nominal annual rate of 9%.
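The 9% example can be reproduced in a couple of lines. The function name `effective_annual_rate` is an illustrative choice; the calculation is simply the nominal rate divided by n, compounded n times, less one.

```python
def effective_annual_rate(nominal_rate, n):
    """Effective rate when a nominal rate is compounded n times per period."""
    return (1 + nominal_rate / n) ** n - 1

ear_monthly = effective_annual_rate(0.09, 12)
print(f"Nominal 9% compounded monthly -> effective {ear_monthly:.2%}")
```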
3.1.4 Continuous compounding
What happens to the effective interest rate an investor would earn in a case where the
number of sub-periods within a period tends to infinity? This is known as continuous
compounding, for obvious reasons.
To consider this case, we take the analysis above and let $n \to \infty$. This leads to
the following expression for the effective interest rate under continuous compounding
($r^{*}$):
\[
1 + r^{*} = \lim_{n \to \infty} \left( 1 + \frac{r}{n} \right)^n = e^{r}
\]
where e is the base of the natural logarithm (i.e. 2.71828 ...). Hence, if the nominal
annualized rate was 8% and compounding was continuous, after one year an investor
would earn an effective annual rate of 8.33% on any deposit.
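The convergence to the continuous compounding limit can be seen numerically. In the sketch below, the list of compounding frequencies is an arbitrary illustration; the final line evaluates the limiting effective rate, e^r - 1, for a nominal rate of 8%.

```python
import math

r = 0.08  # nominal annual rate

# Effective rate for an increasing number of compounding sub-periods.
for n in (1, 2, 12, 365):
    effective = (1 + r / n) ** n - 1
    print(f"n = {n:>3}: effective rate = {effective:.6f}")

continuous_rate = math.exp(r) - 1  # limit as n tends to infinity
print(f"continuous: effective rate = {continuous_rate:.6f}")
```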
3.1.5 Compounding and future values
Finally, we can use the idea of compound interest to define the notion of a future
value. The future value of an investment at interest rate r, k periods from the present,
is the amount that the initial investment would compound up to in k periods' time.
If, for the time-being, we assume that compounding only occurs at the end of each
period then it is obvious that the future value (FV) of an initial investment of X for
k periods at nominal rate r is given by:
\[
FV(r, k) = X (1 + r)^k
\]
Here I have written the future value as a function of r and k explicitly to emphasize
that varying these numbers alters the future value. Thus, the future value of £100,
at nominal rate 5%, 8 years from today is £147.75.
3.2 Discounting and the time value of money
We can use the logic of compound interest and future values in reverse to compute
present values using a technique called discounting. Discounting relies on the time-
value of money to allow us to compare quantities of cash earned at a known point in
the future with amounts of cash earned today.
To introduce this idea, consider the following scenario. The current annual
interest rate on all deposits is 8%. You are offered the choice between two investment
alternatives:
• Option A delivers £100 today.
• Option B delivers £108 in one year's time.
Which would you choose? How should one compare cashflows receivable at different
dates?
Analysis:
• Note that one can take the money delivered by the first option and place it on
deposit for one year. Then one can directly compare the two cashflows as they
arrive at the same date.
• Accordingly, the £100 received today, deposited for one year at 8%, would grow
to a future value of exactly £108 in one year's time.
• This is exactly equal to the cashflow from the second option and, thus, the two
options are equivalent: with a prevailing interest rate of 8%, £100 today is
equivalent to £108 in one year's time.
The fact that money earned today is equivalent to a larger sum of money earned in
the future is known as the time-value of money and is a direct consequence of positive
interest rates.
As we've already said, we can run this argument in reverse to compare current and
future cashflows also. The argument would be as follows.
• The second option promises us £108 in one year's time.
• Given interest rates of 8%, the amount of money we would need to deposit today
in order to generate £108 in a year's time is exactly equal to £100: £100 is
the present value of a promise of £108 receivable in one year.
• Then, as the present value of the second option is identical to the immediate
cashflow promised by the first option, we again argue that the two investments
are equivalent.
3.2.1 Present values
The present value of a future cashflow is the amount of money one would need to
place on deposit today to generate that future cashflow at the same date.
If that future cashflow is X, it will arrive in k periods and the current nominal
period interest rate is r, then the present value is:
\[
PV(r, k) = \frac{X}{(1 + r)^k}
\]
All we're doing here is deflating the future payment by the compound interest factor
that would have accrued through its lifetime. This present value can then be directly
compared with a second present value in order to provide a ranking. The process
described above is also known as discounting.
• The quantity $X / (1 + r)^k$ is often referred to as the discounted value of X.
• Similarly, $1 / (1 + r)^k$ is often called the k period discount factor (at interest rate r).
• Obviously, the present value of a given amount to be received in k years' time
is just the product of that amount and the k period discount factor.
3.2.2 Discounting example
Assume we are offered the choice between option A, a promise of £200 to be received
in 5 years, and option B which guarantees £175 in three years. If interest rates are
10% which should be chosen? Well, the present value of A is:
\[
PV_A = \frac{200}{(1.10)^5} = 124.18
\]
Similarly, the present value of B is:
\[
PV_B = \frac{175}{(1.10)^3} = 131.48
\]
Thus option B is preferable: it offers a smaller nominal amount but the money is
received closer to today. This is the time-value of money again.
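The comparison between the two options can be reproduced with a pair of small helper functions. This is a minimal sketch; the function names are illustrative and compounding is assumed to occur once per period, as in the text.

```python
def discount_factor(rate, periods):
    """k-period discount factor 1 / (1 + r)^k."""
    return 1 / (1 + rate) ** periods

def present_value(cashflow, rate, periods):
    """Present value of a single cashflow received after `periods` periods."""
    return cashflow * discount_factor(rate, periods)

def future_value(amount, rate, periods):
    """Future value of an amount invested today for `periods` periods."""
    return amount * (1 + rate) ** periods

pv_a = present_value(200, 0.10, 5)   # option A: 200 in five years
pv_b = present_value(175, 0.10, 3)   # option B: 175 in three years
print(f"PV of A = {pv_a:.2f}, PV of B = {pv_b:.2f}")
print("Preferred option:", "B" if pv_b > pv_a else "A")
```

Discounting and compounding are inverse operations, so `future_value(present_value(x, r, k), r, k)` returns x up to rounding error.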
3.2.3 Present values of streams of payments
Now, instead of assuming that an investor is to receive a single future cashflow,
assume that he is to receive a stream of payments: one payment each period for the
next k periods. Denote the cashflow to be received j periods hence with $X_j$. How
does one compute the present value of this entire stream?
The present value of the stream of payments is just the sum of the individual present
values. Thus, with interest rate r, the present value is:
\[
PV = \frac{X_1}{(1 + r)} + \frac{X_2}{(1 + r)^2} + \frac{X_3}{(1 + r)^3} + \dots = \sum_{i=1}^{k} \frac{X_i}{(1 + r)^i}
\]
The situation is slightly more straightforward if we assume that the nominal cashflows
to be received in the future are the same for every date. If we call this value X then
the present value simplifies to:
\[
PV = \sum_{i=1}^{k} \frac{X}{(1 + r)^i} = \frac{X}{r} \left[ 1 - \frac{1}{(1 + r)^k} \right]
\]
This type of cashflow stream is known as an annuity. An even more straightforward
cashflow stream to value is the perpetuity. This stream is like an annuity except the
payments of X per period go on for an infinite number of periods. Thus, in this case,
the present value is:
\[
PV = \sum_{i=1}^{\infty} \frac{X}{(1 + r)^i} = \frac{X}{r}
\]
Note the similarity between the formulae for the present values of the annuity and
perpetuity. One can view a perpetuity as an infinitely lived annuity, i.e. an annuity
where $k \to \infty$. If we let k tend to infinity in the annuity formula, we get the present
value formula for the perpetuity.
3.2.4 Examples
Below are a few examples of present value calculations using the formulae from the
preceding subsection. Assume in all cases that nominal interest rates are fixed at 8%.
1. An individual holds an investment that will deliver £350 in one year's time, £200
in two years and £100 in the third year. No further cashflows are to be paid to
the investor. What is the present value of this investment? Given an interest
rate of 8% then the present value is as follows:
\[
PV = \frac{350}{1.08} + \frac{200}{1.08^2} + \frac{100}{1.08^3} = 574.93
\]
2. A financial security promises annual payments of £50 for 12 years. How much
would one pay for this asset? Well, the present value of this annuity is the fair
price for the asset. The present value is:
\[
PV = \frac{50}{0.08} \left[ 1 - \frac{1}{(1.08)^{12}} \right] = 376.80
\]
3. Another asset promises payments of £12, once a year forever. What is the fair
price for this asset? Again, the fair price is the present value of the perpetuity.
It is:
\[
PV = \frac{12}{0.08} = 150
\]
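The three examples can be checked with the short sketch below. The function names are illustrative; `pv_stream` sums individual present values, while `pv_annuity` and `pv_perpetuity` use the closed-form expressions from the preceding subsection.

```python
def pv_stream(cashflows, rate):
    """PV of cashflows received at the end of periods 1, 2, ..., len(cashflows)."""
    return sum(cf / (1 + rate) ** i for i, cf in enumerate(cashflows, start=1))

def pv_annuity(payment, rate, periods):
    """Closed-form PV of a level payment received for `periods` periods."""
    return payment / rate * (1 - 1 / (1 + rate) ** periods)

def pv_perpetuity(payment, rate):
    """PV of a level payment received forever."""
    return payment / rate

rate = 0.08
pv1 = pv_stream([350, 200, 100], rate)
pv2 = pv_annuity(50, rate, 12)
pv3 = pv_perpetuity(12, rate)
print(f"Example 1: {pv1:.2f}")
print(f"Example 2: {pv2:.2f}")
print(f"Example 3: {pv3:.2f}")
```

As a consistency check, `pv_stream([50] * 12, rate)` gives the same value as the closed-form annuity formula.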
3.2.5 Annuity example
Scenario: you have just retired at the age of 65. Through your working life, you
have accumulated cash which amounts to £250,000 at the current date. You wish
to buy an annuity which gives you a constant income for the next 20 years using
£200,000 of your bank balance.
Question: if interest rates are 6% per annum, what constant income will such an
annuity deliver?
Answer: using our annuity valuation formula we have:
\[
200{,}000 = \frac{X}{0.06} \left[ 1 - \frac{1}{(1 + 0.06)^{20}} \right]
\]
Rearranging tells us that X, the annual income from the annuity, is £17,436.91.
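The rearrangement can be written out explicitly: solving the annuity formula for the payment gives X = PV × r / (1 - (1 + r)^(-k)). The sketch below applies this to the retirement example; the function name `annuity_payment` is an illustrative choice.

```python
def annuity_payment(present_value, rate, periods):
    """Level payment whose annuity present value equals `present_value`."""
    return present_value * rate / (1 - (1 + rate) ** -periods)

income = annuity_payment(200_000, 0.06, 20)
print(f"Annual income: {income:,.2f}")
```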
3.2.6 Present values and uncertainty
Until now we have assumed that all cashflows to be discounted/compounded are
known precisely. How do we deal with uncertainty?
We use exactly the same procedure except:
• Substitute expected cashflows for actual cashflows in the calculations.
• Use a risk-adjusted discount rate rather than a risk-free interest rate.
Where does the risk-adjusted discount rate come from?
In later lectures on risky securities (e.g. equities) we will discuss how to derive such
rates. A key input to their determination will be individuals' attitudes towards risk.
3.3 Summary
We have reviewed the basic ideas behind interest calculations, compounding and
discounting. These ideas will be used in what follows to derive values for fixed
income securities, such as government and corporate bonds.
It should be noted that throughout our analysis we have assumed a single interest rate
for all individuals and for borrowers and lenders. Obviously, in the real world, this
symmetry does not hold: borrowers pay higher interest rates than lenders receive
and less creditworthy individuals pay higher rates than more prudent individuals.
These observations in no way invalidate the analysis developed above. They simply
imply that when computing present or future values one must employ the appropriate
rate of interest in the given context.
3.4 Fixed Income Securities and Markets
Basic definition: fixed income securities are so named as the cashflows they deliver
to an investor, as well as the dates that these cashflows will arrive, tend to be known
(fixed) in advance. The more common name for such securities is bonds.
Issuers: bonds can be issued by a variety of entities. Governments issue bonds to
finance their spending, as do local governments. Corporations issue bonds to finance
investment projects. Financial institutions issue bonds also.
In what follows in this lecture we will review the basic types of bond before looking
at how to value them. For valuation we will make use of the present value techniques
discussed earlier. Finally we will discuss corporate bond ratings and some topics
related to the management of bond portfolios.
3.4.1 Size of Bond Markets
Before we start, however, let us give some information on the relative importance of
the segments of the fixed income markets and a comparison with equity markets.
• Treasury debt (i.e. US government debt) and corporate debt are very important.
• Interestingly, though, both are smaller than the mortgage-backed segment of
the bond market. These bonds are created by financial institutions, who pool
payments from a large number of individual mortgage contracts and redistribute
these payments as bond coupons.
• Total US debt outstanding in 1996 was around $12 trillion, such that the size of
the market has more than doubled in the last decade. Much of this growth has
come from mortgage-related and corporate bonds.

Table 3.1: US bond markets: outstanding debt (30/06/2006)
Segment            Issuer                          Debt ($ trillions)
Municipal          Local Gov.                      2.3
Treasury           Treasury                        4.2
Mortgage-related   Private                         6.2
Corporate          Corporations                    5.2
Federal agency     Federal agencies                2.7
Money market       Corporations, Financial inst.   3.7
Asset backed       Financial institutions          2.0
Total                                              26.4

Table 3.2: US equity markets: total market capitalization (31/12/2005)
Exchange   Market cap. ($ trillions)
Nasdaq     3.9
NYSE       13.3
Amex       0.5
Total      17.7
For comparison, as of end 2005, the total market cap of the three main US equity
exchanges (NYSE, Nasdaq and AMEX) is shown in Table 3.2.
The key point regarding this table comes from comparing total equity market cap
with total debt in issue. The amount of wealth invested in US bond markets is roughly
50% larger than that invested in US equity markets ($26.4 trillion against $17.7
trillion). Thus, while the finance media tends to focus on stock market fluctuations,
our figures tell us that bond markets are deserving of at least as much, if not more,
attention.
3.4.2 Types of bond
Let's start by providing a generic definition for a fixed-income security:
Definition of a bond: a bond is a security issued by a borrower, i.e.
the issuer, to a lender which obligates the borrower to make pre-specified
payments to the lender at pre-specified future dates. Upon issue the lender
is obliged to pay the borrower the bond price.
In making the preceding definition more concrete, there are a number of factors
that must be specified. The identities of the issuer and the purchaser are the most
obvious features. Then we must specify the size and timing of the payments that
the bond delivers. It is in specifying the payment structure that we can distinguish
several common types of bond.
3.4.3 Zero coupon (discount) bonds
A zero-coupon bond promises a single payment, known as the face value or par value,
at a specified future date, called the maturity date.
Thus, the lender pays the issuer the bond price today in return for a single repayment
at a given date.
Example, zero-coupon bond: a zero-coupon bond with face value £100
and 7 years to maturity has a single cashflow of £100 that will be paid by
the issuer to the holder in 7 years' time.
3.4.4 Coupon bonds
These bonds deliver periodic cashflows, called coupon payments, until maturity. At
maturity a final coupon is paid along with the face value of the bond.
The frequency of the coupon could be set at any level but the majority of US bonds
have a semi-annual coupon frequency.
European bonds tend to pay coupons annually.
Example, coupon bond: Consider a 3 year bond, with semi-annual
coupons, face value $100 and coupon rate 8%. This bond obligates the
issuer to pay the following cashflows to the holder:

Years from now   Cashflow
0.5              $4
1.0              $4
1.5              $4
2.0              $4
2.5              $4
3.0              $104

Coupon payments are calculated as follows: multiply the coupon rate of
8% by the face value of $100. This yields $8, which is the annual coupon
payment. For a bond with semi-annual coupon payments, half of the annual
coupon is paid every six months.
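The cashflow schedule in this example can be generated mechanically from the bond's terms. The sketch below is illustrative: the function name and argument names are assumptions, and it simply splits the annual coupon evenly across the payment dates and adds the face value to the final payment.

```python
def coupon_bond_cashflows(face, coupon_rate, years, payments_per_year):
    """Return a list of (time in years, cashflow) pairs for a coupon bond."""
    coupon = face * coupon_rate / payments_per_year
    n_payments = years * payments_per_year
    flows = [(i / payments_per_year, coupon) for i in range(1, n_payments + 1)]
    # The face value is repaid together with the final coupon.
    last_time, last_coupon = flows[-1]
    flows[-1] = (last_time, last_coupon + face)
    return flows

schedule = coupon_bond_cashflows(face=100, coupon_rate=0.08, years=3, payments_per_year=2)
for time, cashflow in schedule:
    print(f"{time:>4.1f}  ${cashflow:.0f}")
```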
3.4.5 Coupon bonds as portfolios of zero-coupon bonds
Let us assume a world in which zero-coupon bonds, with face value $1, are available
at all possible maturities.
Then it should be fairly obvious that we can view any coupon bond as a portfolio of
zero-coupon bonds with various maturities. This is useful, as if we know the prices
of the $1 zero-coupon bonds, we can use the portfolio relationship to infer a price for
a specific coupon bond.
In practice, especially for government issued bonds, the process of forming a coupon
bond as a portfolio of zeroes is often practiced in reverse. This is called stripping and
it consists of exchanging a coupon bond for the portfolio of zeroes that replicate its
cashflows. These individual zeroes can then be sold separately.
For illustration, consider the example from the previous subsection.
Example, coupon bonds and zeroes: our coupon bond delivers a cashflow
of $4 every six months for the next 3 years and also delivers an extra
$100 on the maturity date. If $1 zero-coupon bonds exist at all of the
required maturities then our coupon bond is identical to the following
portfolio of zeroes:

Maturity   Number of zeroes held
0.5        4
1.0        4
1.5        4
2.0        4
2.5        4
3.0        104

By construction, this portfolio and the coupon bond have the same payoff
at each date.
3.4.6 Slightly more exotic bonds
Below are a few of the more common variations on standard bond structures:
1. Perpetual bonds: never mature. They are perpetuities. They pay coupons
at a specified rate forever. Known as Consols in the UK.
2. Floating rate bonds: in the coupon bond example above the coupon was a
fixed percentage of the face value. Many bonds have coupon rates which vary
over the bond's lifetime. Generally, the floating coupon rate is set at a premium
over some market interest rate (e.g. LIBOR or the U.S. T-bill rate) and is reset
on a pre-specified basis.
3. Index-linked bonds: coupons and principal grow in line with inflation (in the
relevant country). First issued in the U.K. and issued increasingly frequently
by Governments. As such, they can be thought of as real, risk-free securities
(although in most cases indexation is not perfect).
3.4.7 Bonds and Geography
We can also distinguish bonds by the geographical locations in which the issuer
resides, in which the issue is made and in which the issue is traded.
• A foreign bond is a bond issued by a borrower in a country different from that
borrower's country of origin (i.e. the borrower is selling debt abroad). The bond
is denominated in the currency of the country in which it is sold. Hence, if a
Japanese firm sells Sterling denominated debt in the U.K. it has issued foreign
bonds. Such Sterling denominated foreign bonds are colloquially known as
bulldog bonds.
• Eurobonds are bonds denominated in the currency of one country but actually
sold or traded in another, different country. So, for example, a Eurosterling
bond will be denominated in Sterling but sold outside the U.K. The U.K. is
one of the major global Eurobond markets.
3.4.8 Bonds with option-like features
Finally, certain types of bond have option-like features to them. Either the holder
or the issuer has some right to change the nature of the contract at certain points
during its lifetime.
• Callable bonds can be repaid early (i.e. before maturity) by the issuer if he so
chooses. Early repayment might be restricted to a specified date (European)
or may be allowed at any time prior to maturity (American).
• Puttable bonds give control over the redemption date to the holder rather than
the issuer.
• Convertible bonds: corporations sometimes issue debt which (either at a specific
date or at any time) can be converted into a share in the firm's equity. As such,
this type of debt allows bondholders, as well as shareholders, to participate in
upside gains to firm value.
3.4.9 Varieties of Government debt
The names and features of bonds issued to finance government debt obviously vary
with the country in question. Below we focus on the most familiar cases.
• UK: government bonds are issued by the Debt Management Office and known
as Gilts.
• US: bonds are generically called Treasury securities. Long term securities are
Treasury bonds, medium term securities are Treasury notes and short term
securities are called Treasury bills.
• Germany: long term government bonds are called Bunds, medium term bonds
are Bobls and short term notes are called Schatze.
CHAPTER 4
Fixed income securities
Where we are: in the previous chapter we discussed present value calculations
before venturing on to introduce the basic features of fixed income securities and the
most commonly traded securities of this type.
Where we are going: here we will build on our understanding of the basic structure
of bonds to discuss their valuation. We make use of the present value techniques from
lecture 1 plus some standard absence of arbitrage arguments. We go on to talk about
corporate bond ratings and some concepts from bond portfolio management.
4.1 Bonds: valuation and the term-structure
For now we will focus on bonds issued by governments as they are easiest to deal
with. The reason for this is that, for major, developed nations at least, the bonds
are free from default risk. This is the risk that the borrower will be unable to make
the payments specified in the bond contract. For a corporation this risk is very real
but for the governments of the U.K., the US or Japan, for example, the possibility
that the government will not be able to repay its debts is effectively non-existent.
UK government debt: issued and managed by the Debt Management Office, an
arm of H.M. Treasury. At the end of July 2006, UK public sector net debt was around
£465.4 billion, which is approximately 36.5% of UK GDP. UK government bonds are
called Gilts.
US national debt: administered by the Bureau of the Public Debt, an arm of the
US Department of the Treasury. As of mid-August 2006, US national debt totaled
around $8.5 trillion. This is approximately 65% of US GDP. US government debt is
made up of issues of various US treasury securities. Treasury bills are zero-coupon
instruments with maturities of less than one year. Treasury notes have maturities
of 2 to 10 years and pay semi-annual coupons. Finally, treasury bonds are coupon
paying securities with maturities longer than 10 years.
4.1.1 The term-structure of interest rates
To value our bonds we are going to make use of the discounting framework introduced
previously. However, we must add one complication to that analysis before it becomes
useful to us.
The complication that we need to introduce is the fact that the interest rate relevant
to a payment t periods in the future can differ from the interest rate relevant for
payments made s periods in the future. We call the interest rate relevant to maturity
t the t-year spot rate of interest.
In the last lecture, we defined the t-period discount factor as follows:
\[
d_t = \frac{1}{(1 + r)^t}
\]
If we allow spot rates to vary with maturity (t), this means that our discount rate
(r) now needs a time subscript. The discount factor now becomes:
\[
d_t = \frac{1}{(1 + r_t)^t}
\]
Clearly, to value a bond with several future cashflows, we will need to know the spot
rate for all of those maturities. A plot of the spot rate as a function of maturity
is called the term structure of interest rates. It tells us how heavily (or lightly) we
should discount money received at various points in the future.
The term structure is not static over time. Changes in its position and shape occur
continuously, due to economic and market events. On average, the term structure
is upward sloping: larger discount rates should be used to discount cashflows to
be received a long way in the future relative to payments received close to today.
During certain periods, mainly those of economic recession, term structures may
slope downwards.
4.1.2 Valuation of zero-coupon bonds
We start with the simplest valuation example: the zero coupon bond. As previously
discussed, a zero only delivers one cash flow, at a given future date. To value
the bond we simply compute the present value of the cash flow it generates. This
present value must be equal to the market price of the bond.
If we define the face value of our zero coupon to be X, then the price of a zero with
maturity T is given by;
\[ P_T = \frac{X}{(1+r_T)^T} \tag{4.1} \]
Obviously, given positive interest rates, the price of a zero is always below face value.
Thus, the bond always trades at a discount to face value (which gives the bond its
alternative name of a discount bond).
Example: if the 4-year spot rate is 4.5%, the price of a zero coupon bond
with 4 years to maturity and a face value of 100 is as follows;
\[ P_4 = \frac{100}{1.045^4} = 83.86 \]
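As a quick check of equation (4.1), the calculation can be sketched in a few lines of Python. The function name zero_coupon_price is purely illustrative and not part of the notes.

```python
def zero_coupon_price(face_value, spot_rate, maturity):
    """Price of a zero-coupon bond via equation (4.1): X / (1 + r_T)^T."""
    return face_value / (1.0 + spot_rate) ** maturity


# The 4-year zero from the example: face value 100, 4-year spot rate 4.5%.
price = zero_coupon_price(face_value=100.0, spot_rate=0.045, maturity=4)
print(f"P_4 = {price:.2f}")
```

Because the spot rate is positive, the printed price is below the face value of 100, as the text says it must be.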
4.1.3 Valuation of coupon bonds: annual coupon payment
In principle, valuation of coupon bonds is no different to valuation of zero-coupon
bonds. The market price of the bond must be equal to the present value of its
cash flows. Thus, if we have a bond with annual coupon payments at coupon rate c,
face value X and maturity T the price is;
\[ P_T = \frac{cX}{(1+r_1)} + \frac{cX}{(1+r_2)^2} + \dots + \frac{(1+c)X}{(1+r_T)^T} = \sum_{t=1}^{T} \frac{cX}{(1+r_t)^t} + \frac{X}{(1+r_T)^T} \]
Unlike the zero-coupon case, coupon bond prices can be below, at or above the face
value of the bond.
When the price of a bond is below face value then it is said to be trading at
a discount. When a bond's price is above face value then it is trading at a premium.
Finally, if a bond's price is exactly its face value then the bond is said to be trading
at par.
Example: assume that the one, two and three year spot rates are 4%, 4.25%
and 5% respectively. The price of a bond with face value 500, 3 years to
maturity and annual coupon payments at rate 6% can be written down as
follows. Noting that the actual annual coupon payment will be 0.06 × 500 =
30 we have;
\[ P_3 = \frac{30}{1.04} + \frac{30}{1.0425^2} + \frac{530}{1.05^3} = 514.28 \]
Note that this bond is trading at a premium.
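The same calculation can be sketched for any annual-coupon bond once the spot curve is known. The helper name coupon_bond_price and its argument layout are assumptions made for illustration only.

```python
def coupon_bond_price(face_value, coupon_rate, spot_rates):
    """Price an annual-coupon bond off the spot curve.

    spot_rates[t - 1] is the t-year spot rate; the bond matures after
    len(spot_rates) years.
    """
    maturity = len(spot_rates)
    coupon = coupon_rate * face_value
    price = 0.0
    for t, r in enumerate(spot_rates, start=1):
        # The final cash flow is the last coupon plus the face value.
        cash_flow = coupon + (face_value if t == maturity else 0.0)
        price += cash_flow / (1.0 + r) ** t
    return price


# The example bond: face value 500, 6% annual coupon, 3 years to maturity.
price = coupon_bond_price(500.0, 0.06, [0.04, 0.0425, 0.05])
status = "premium" if price > 500.0 else "discount" if price < 500.0 else "par"
print(f"P_3 = {price:.2f} ({status})")
```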
4.1.4 Valuation of coupon bonds: semi-annual coupon payment
For the case of semi-annual coupon payments, convention dictates that the discount
rate to be used for a semi-annual period is half of the APR. Hence, if a six month
spot rate was quoted at an APR of 5.4%, we would actually use a rate of 2.7% to
discount any payment to be received in six months.
Thus, if the annual coupon rate on a bond with T years to maturity is c, but
coupons are paid semi-annually then the bond price is;
\[ P_T = \frac{0.5\,cX}{\left(1+\frac{r_{0.5}}{2}\right)} + \frac{0.5\,cX}{\left(1+\frac{r_1}{2}\right)^2} + \dots + \frac{(1+0.5c)X}{\left(1+\frac{r_T}{2}\right)^{2T}} \]
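A minimal sketch of the semi-annual convention follows. The function name and the two APR-quoted spot rates (5.4% and 5.6%) are hypothetical values chosen for illustration; the key step is that each APR is halved and the exponent counts half-year periods.

```python
def semiannual_bond_price(face_value, annual_coupon_rate, apr_spot_rates):
    """Price a bond paying coupons every six months.

    apr_spot_rates[k - 1] is the APR-quoted spot rate for the payment due
    after k half-year periods (k = 1, ..., 2T).
    """
    n_periods = len(apr_spot_rates)
    coupon = 0.5 * annual_coupon_rate * face_value
    price = 0.0
    for k, apr in enumerate(apr_spot_rates, start=1):
        cash_flow = coupon + (face_value if k == n_periods else 0.0)
        # Convention: discount at half the APR, once per half-year period.
        price += cash_flow / (1.0 + apr / 2.0) ** k
    return price


# One-year bond, face value 100, 6% annual coupon paid in two instalments.
example_price = semiannual_bond_price(100.0, 0.06, [0.054, 0.056])
print(f"{example_price:.2f}")
```

As a sanity check on the convention, a bond whose coupon rate equals a flat APR spot rate prices exactly at par.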
4.1.5 Zero-coupon bonds and the term structure
Earlier we used the term-structure of interest rates to determine the price of
zero-coupon bonds. We're now going to reverse the logic used there to show how we
can retrieve the term structure of interest rates from the range of zero-coupon bond
prices.
First, recall the pricing equation for a zero coupon bond with face value X and
maturity T;
\[ P_T = \frac{X}{(1+r_T)^T} \]
Note that, as we know the face value, if we observe the market price of the bond
(\(P_T\)), then the only unknown in this equation is the T period spot rate.
Thus, by rearranging our equation we can tie the implied spot rate down exactly;
\[ r_T = \left(\frac{X}{P_T}\right)^{\frac{1}{T}} - 1 \tag{4.2} \]
Now, given zero-coupon bond prices for a range of maturities, we can calculate spot
rates for those same maturities in very straightforward fashion. The set of calculated
spot rates can then be used to value new instruments. A very useful construction
in this pricing is the maturity T discount factor (\(d_T\)) implied by the price of the
maturity T zero coupon bond;
\[ d_T = \frac{1}{(1+r_T)^T} = \frac{P_T}{X} \tag{4.3} \]
\(d_T\) is the current value of 1 unit of currency to be received in T periods.
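Equations (4.2) and (4.3) translate directly into code. The sketch below uses two hypothetical zero prices (93.46 and 86.94 per 100 of face value) purely for illustration; the function names are likewise assumptions.

```python
def implied_spot_rate(zero_price, face_value, maturity):
    """Equation (4.2): r_T = (X / P_T)^(1/T) - 1."""
    return (face_value / zero_price) ** (1.0 / maturity) - 1.0


def implied_discount_factor(zero_price, face_value):
    """Equation (4.3): d_T = P_T / X."""
    return zero_price / face_value


# Hypothetical zero-coupon prices (face value 100), keyed by maturity.
zero_prices = {1: 93.46, 2: 86.94}
spot_rates = {T: implied_spot_rate(p, 100.0, T) for T, p in zero_prices.items()}
discount_factors = {T: implied_discount_factor(p, 100.0)
                    for T, p in zero_prices.items()}

for T in zero_prices:
    print(T, round(spot_rates[T], 4), round(discount_factors[T], 4))
```

Each discount factor equals 1 / (1 + r_T)^T for the matching implied spot rate, which is simply equation (4.3) read from left to right.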
4.1.6 Valuing a bond by replication
Consider a bond with maturity T and an arbitrary set of cash flows from t = 1, .., T.
Call the time t cash flow \(C_t\). Now, the value of this bond using the standard method
of discounting cash flows by the term-structure of interest rates is as follows;
\[ P = \sum_{t=1}^{T} \frac{C_t}{(1+r_t)^t} \]
However, we can use the prices of zero-coupon bonds, more specifically the discount
factors we derived from them, to write down this price in a different fashion.
\[ P = \sum_{t=1}^{T} C_t d_t = \sum_{t=1}^{T} C_t P^Z_t \]
where \(P^Z_t\) is the price of a $1 t-period zero-coupon bond. As the final equality makes
clear, as the discount factors are just the ratio of the relevant zero-coupon bond price
to its face value, this equation is implicitly a linear relationship between the price of
our new bond and zero-coupon bond prices.
This is an application of the principle of absence of arbitrage. An arbitrage is a
portfolio that delivers money from nothing. In well-functioning financial markets
such portfolios should not exist (as greedy investors should trade them until they
are no longer profitable). Thus, one method used to price certain financial securities
is to use the notion that arbitrages should not exist. This is exactly what we are
doing here. If our new bond traded at any price other than that implied by the set
of zero-coupon bond prices then there would be an opportunity for a trader to make
risk-less trading profits.
4.1.7 Example: coupon bond valuation
Consider a bond with face value 100 and 2 periods to maturity. The annual coupon
rate is 6.5% and coupons are paid annually. The cash flows to be received from the
bond are thus;
Time     Cash flow
t + 1    6.50
t + 2    106.50
Let us now assume that there exist zero-coupon bonds with face values 100 and
maturities of 1 and 2 periods. The prices of these are 93.46 and 86.94 respectively.
Given our prior definition, the one and two period discount factors implied by these
zero coupon bond prices are 0.9346 and 0.8694 (i.e. just the ratio of the prices to the
face values).
It's now easy to price our coupon bond. This bond is a portfolio of 0.065 units of the
maturity 1 zero coupon bond and 1.065 units of the maturity 2 zero. Thus its price
should be;
\[ P = 0.065 \times 93.46 + 1.065 \times 86.94 = 98.66 \]
Alternatively, using the discount factors we derived and our earlier formula;
\[ P = \sum_{t=1}^{T} C_t d_t = 6.5 \times 0.9346 + 106.5 \times 0.8694 = 98.66 \]
What would be true if the coupon bond sold at a price different to 98.66? Suppose
that the market price for this bond was 98. Then, the coupon bond would be
undervalued relative to the zeroes. If we were to buy the coupon bond and sell 0.065
units of the maturity one zero and 1.065 units of the maturity two zero then we would
generate the following cash flows;
Time     Coupon bond    Zeroes     Total
t        -98            98.66      0.66
t + 1    6.5            -6.5       0
t + 2    106.5          -106.5     0
From this trade we make 0.66 today for no future risk: an arbitrage profit. Only
when the coupon bond price is exactly 98.66 do no arbitrage opportunities exist.
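The replication argument can be sketched numerically as follows. The variable names are illustrative; the prices and cash flows are those of the example, and 98 is the hypothetical mispriced market quote discussed above.

```python
zero_prices = {1: 93.46, 2: 86.94}   # prices of zeros with face value 100
zero_face = 100.0
cash_flows = {1: 6.5, 2: 106.5}      # cash flows of the coupon bond

# Units of each zero needed to reproduce the coupon bond's cash flows.
units_of_zero = {t: cf / zero_face for t, cf in cash_flows.items()}
replication_cost = sum(units_of_zero[t] * zero_prices[t] for t in cash_flows)

# Equivalent calculation via discount factors d_t = P_t / X.
discount_factors = {t: p / zero_face for t, p in zero_prices.items()}
present_value = sum(cash_flows[t] * discount_factors[t] for t in cash_flows)

# Buy the coupon bond at the (hypothetical) market price and sell the zeros.
market_price = 98.0
profit_today = replication_cost - market_price
net_future_flows = {t: cash_flows[t] - units_of_zero[t] * zero_face
                    for t in cash_flows}

print(round(replication_cost, 3), round(present_value, 3))
print(round(profit_today, 3), net_future_flows)
```

The two valuations agree, every future net cash flow is zero, and the only non-zero entry is the cash received today, which is what makes the trade an arbitrage.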
4.1.8 Coupon bonds and the term-structure
We can also derive the term-structure of interest rates from the prices of coupon
bonds of various maturities.
The job is not as easy as when we are using zero-coupon bonds as coupon bonds are
more complicated instruments, paying monies at various different dates. However,
the principle is exactly the same.
If we know the cash flows and cash flow dates for a variety of bonds, plus their market
prices, we should be able to work out the spot rates that were used to discount the
cash flows in order to get prices.
Using the prices of coupon bonds to derive spot rates is often referred to as
bootstrapping the term structure. Below is a simple example.
4.1.9 Example: bootstrapping the term structure
Assume we know the prices of two bonds. The first is a one-period zero-coupon bond
with face value 100 and price, denoted by \(P_1\), of 93.46.
The second is a coupon bond with face value 100, two periods to maturity and a
coupon rate of 8%. Coupons are paid annually. The price of the coupon bond, \(P_2\), is
101.59.
What are the one and two period spot rates?
Well, the one period spot rate is easy to deduce from the price of the one period zero.
Using equation (4.2) we get;
\[ r_1 = \left(\frac{X}{P_1}\right)^{\frac{1}{1}} - 1 = \left(\frac{100}{93.46}\right) - 1 = 0.07 \]
Thus the one-period spot rate is 7%. Now that we know the one period spot rate, we can
use the price of the two period coupon bond and its cash flows to deduce the two period
spot rate.
\[ P_2 = \frac{8}{1.07} + \frac{108}{(1+r_2)^2} = 101.59 \;\Rightarrow\; r_2 = 0.07125 \]
Now, if we also had the price of a coupon bond with 3 periods to maturity we could,
using our values for \(r_1\) and \(r_2\), deduce \(r_3\) and so on ....
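The step-by-step logic generalises to any number of maturities. The sketch below is one possible implementation under a simplifying assumption that is not stated in the notes: exactly one annual-coupon bond is available for each maturity 1, 2, ..., n. The function name bootstrap_spot_rates is illustrative.

```python
def bootstrap_spot_rates(bonds):
    """Bootstrap spot rates from annual-coupon bond prices.

    bonds is a list of (price, face_value, coupon_rate) tuples, ordered so
    that the k-th entry matures in k periods (an assumption of this sketch).
    """
    spot_rates = []
    for maturity, (price, face, coupon_rate) in enumerate(bonds, start=1):
        coupon = coupon_rate * face
        # Present value of the earlier coupons, using rates already found.
        pv_known = sum(coupon / (1.0 + r) ** t
                       for t, r in enumerate(spot_rates, start=1))
        final_cash_flow = face + coupon
        # Solve (price - pv_known) = final_cash_flow / (1 + r)^maturity for r.
        rate = (final_cash_flow / (price - pv_known)) ** (1.0 / maturity) - 1.0
        spot_rates.append(rate)
    return spot_rates


# The two bonds from the example: a one-period zero and a two-period 8% bond.
rates = bootstrap_spot_rates([(93.46, 100.0, 0.0), (101.59, 100.0, 0.08)])
print([round(r, 4) for r in rates])
```

Adding a third tuple for a three-period bond would deliver the three-period spot rate in the same way.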
4.2 Yield to maturity
Very frequently in reports in the finance media one will see bonds referenced not by
their price but by a concept called their yield to maturity. Under certain conditions,
the yield is related to the annual return one gets from holding a bond from today
until maturity.
Thus the yield is often used (and mis-used) to compare the returns one might get
from holding various bonds. As we will see later on, using the yield to maturity in
this way is dangerous.
Definition: yield to maturity: the YTM is the constant discount rate which
equates the present value of a bond's cash flows with its market price. Thus, if we are
looking at a bond with T periods to maturity, annual coupon payments at rate c and
face value X then the yield is the value for y that solves the following equation;
\[ P_T = \sum_{t=1}^{T-1} \frac{cX}{(1+y)^t} + \frac{(1+c)X}{(1+y)^T} \tag{4.4} \]
where \(P_T\) is the price of the bond. Those who have done some work on capital
budgeting and project evaluation before will recognize this calculation: the YTM is
the internal rate of return of the bond.
Note that in the case of semi-annual coupons we begin by calculating the constant,
semi-annual rate that equates price with the discounted sum of the bond's
cash flows. The YTM is double this semi-annual rate.
4.2.1 Features of YTMs
Things that should be understood about YTMs;
• The YTM need not be equal to any current or future spot rate.
• The YTM is, however, obviously related to the term structure of spot rates in
  a fairly complex, non-linear fashion.
• For a zero-coupon bond, though, the YTM is exactly equal to the spot rate for
  the relevant maturity.
Thus the YTM should be viewed as artificial: a fictional construct derived from the
bond's price and the underlying term structure.
Calculation of the YTM is not straightforward. Note that equation (4.4) involves
various powers of y such that solving for the yield can often only be done via some
iterative search procedure. An example is given below;
4.2.2 Example: YTM calculation
Assume that the market price of a bond with two periods to maturity, annual coupon
rate 4% and face value 100 is 101.50. What is the YTM?
Start by guessing that the YTM is 3.5%. If we plug this value into the right-hand
side of equation (4.4) along with the bond's cash flows we get;
\[ \frac{4}{1.035} + \frac{104}{1.035^2} = 100.95 \]
The value derived is slightly lower than the market price of 101.50. Thus let's try
a smaller YTM guess of 3.25%. This gives;
\[ \frac{4}{1.0325} + \frac{104}{1.0325^2} = 101.43 \]
Again, the result is smaller than the market price, but now only slightly so. Thus we
choose a slightly smaller estimate for the YTM, say 3.2% ......
Continuing this process we should home in on the correct value of the YTM, which
is approximately 3.2135%. For this value we have;
\[ \frac{4}{1.032135} + \frac{104}{1.032135^2} = 101.50 \]
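The guess-and-adjust procedure can be automated. The sketch below uses bisection, which is one simple choice of iterative search and not necessarily the one intended in the notes; it relies on the fact that the present value in equation (4.4) falls as y rises. The function names and the search bracket [0, 1] are assumptions.

```python
def bond_pv(y, face_value, coupon_rate, maturity):
    """Right-hand side of equation (4.4) for a trial yield y."""
    coupon = coupon_rate * face_value
    pv = sum(coupon / (1.0 + y) ** t for t in range(1, maturity))
    return pv + (face_value + coupon) / (1.0 + y) ** maturity


def yield_to_maturity(price, face_value, coupon_rate, maturity,
                      low=0.0, high=1.0, tol=1e-10):
    """Bisection search for y; assumes the yield lies between low and high."""
    while high - low > tol:
        mid = 0.5 * (low + high)
        if bond_pv(mid, face_value, coupon_rate, maturity) > price:
            low = mid    # PV too high, so the trial yield is too small
        else:
            high = mid   # PV too low, so the trial yield is too large
    return 0.5 * (low + high)


# The example bond: price 101.50, face value 100, 4% coupon, two periods.
ytm = yield_to_maturity(101.50, 100.0, 0.04, 2)
print(f"YTM = {ytm:.4%}")
```

Each pass through the loop plays the role of one manual guess in the example, halving the interval in which the yield must lie.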
4.2.3 YTMs and coupon rates
There is a clear relationship between the YTM and the coupon rate based on
comparison of the price of the bond and its face value. The following relationships hold;
• If the price of the bond exceeds the face value, then the yield is less than the
  coupon rate.
• If the price of the bond is lower than face value, then the yield is above the
  coupon rate.
• If the price of the bond equals the face value, then the yield is identical to the
  coupon rate.
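These three relationships are easy to confirm numerically by pricing one bond at yields below, equal to and above its coupon rate. The bond used here (face value 100, 5% coupon, 10 periods) is hypothetical, as is the function name.

```python
def bond_price_at_yield(y, face_value, coupon_rate, maturity):
    """Price implied by equation (4.4) for a given yield y."""
    coupon = coupon_rate * face_value
    pv = sum(coupon / (1.0 + y) ** t for t in range(1, maturity))
    return pv + (face_value + coupon) / (1.0 + y) ** maturity


face, coupon_rate, maturity = 100.0, 0.05, 10   # hypothetical bond
for y in (0.03, 0.05, 0.07):
    price = bond_price_at_yield(y, face, coupon_rate, maturity)
    if abs(price - face) < 1e-9:
        relation = "at par"
    elif price > face:
        relation = "at a premium"
    else:
        relation = "at a discount"
    print(f"yield {y:.0%}: price {price:.2f}, trading {relation}")
```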
4.2.4 Yields and returns
As equation (4.4) makes clear, the YTM is just a transform of a bond's price. Higher
bond prices lead to lower yields and vice versa. Under certain conditions we can view
the YTM as a measure of the return from holding the bond.
To see this, consider the following argument. I am holding a bond. Assume that
every time I receive a cash flow from this bond I am able to re-invest that cash flow so
that it earns rate y. At the bond's maturity date, how much money do I have? Well,
using our previous notation, the quantity of money I have is;
\[ cX \sum_{t=1}^{T-1} (1+y)^t + (1+c)X \]
Now, via the definition of the YTM, the following is clearly true;
\[ cX \sum_{t=1}^{T-1} (1+y)^t + (1+c)X = P_T (1+y)^T \]
Hence I can view the yield as follows: if I purchase the bond for \(P_T\) and hold it
to maturity, investing all coupon payments at y, then the yield to maturity is the
annual return I make on my investment.
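The reinvestment argument can be checked with a short sketch. The bond (face value 100, 4% coupon, 5 periods, yield 5%) and the alternative reinvestment rate of 2% are hypothetical numbers chosen for illustration.

```python
def realised_annual_return(price, face_value, coupon_rate, maturity,
                           reinvest_rate):
    """Annualised return from buying at `price` and holding to maturity,
    with every coupon reinvested at `reinvest_rate` until maturity."""
    coupon = coupon_rate * face_value
    # A coupon paid at time t is reinvested for (maturity - t) periods.
    terminal_wealth = sum(coupon * (1.0 + reinvest_rate) ** (maturity - t)
                          for t in range(1, maturity))
    terminal_wealth += face_value + coupon
    return (terminal_wealth / price) ** (1.0 / maturity) - 1.0


face, coupon_rate, maturity, y = 100.0, 0.04, 5, 0.05   # hypothetical bond
coupon = coupon_rate * face
# Price the bond at yield y using equation (4.4).
price = (sum(coupon / (1.0 + y) ** t for t in range(1, maturity))
         + (face + coupon) / (1.0 + y) ** maturity)

print(realised_annual_return(price, face, coupon_rate, maturity, y))
print(realised_annual_return(price, face, coupon_rate, maturity, 0.02))
```

Reinvesting at the yield itself reproduces the yield as the realised return; reinvesting at a lower rate gives a lower realised return even though the bond's price and quoted yield are unchanged.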
However, this notion of the YTM as the return on your investment is problematic.
Some issues are as follows;
• The YTM is your investment return only if you hold the bond to maturity: if
  you sell the bond at any date prior to maturity then your return will depend
  on the market price at the date on which you sell.
• The YTM is only the return on holding to maturity if you can re-invest coupons
  at the yield itself. If you cannot, then your actual return will differ from the
  YTM. Obviously the rate at which you can invest cash at some future date
  need not be equal to the yield at all and will depend on the term structure at
  that time.
Finally, note that one should not use the YTM to compare different bonds. Only if
2 bonds have the same coupon rate and time to maturity can the yield be compared.
Assume, for example, that 2 bonds have differing coupon payments. All else equal,
these two bonds will have different YTMs. However, our definition of the YTM then
requires us to believe that we can re-invest the coupons from the two different bonds
at their own yield. This is of course not going to be true and thus should further
erode our confidence in the YTM as a return concept.
4.3 Corporate Bonds
Corporate bonds are issued by firms to finance their investment and production
activities. Given that they are issued by companies rather than governments, the key
difference between corporate and government bonds is that corporate bonds expose
the holder to default risk.
Default risk is the risk that the company is unable to deliver the promised cash flows
of the bond at any point. Clearly the level of default risk to which a bond exposes
the holder depends on the underlying risk of the firm's activities. Utility firms, for
example, are less likely to default than biotechnology firms.
4.3.1 Bond ratings and rating agencies
Certain commercial organisations help characterise the default risk associated with
bonds by providing credit ratings. The two main players in this market are Moody's
and Standard and Poor's. They assign ratings to bonds such that highly rated bonds
are projected to have low default risk while very low rated bonds (junk bonds) are
believed to be quite likely to default.
4.3.2 Example: Moody's bond ratings
Rating Explanation
Aaa Gilt-edged
Aa High-quality
A Upper-medium grade (possible future weakness)
Baa Adequate security currently, but speculative characteristics
Ba Speculative
B No desirable investment characteristics
Caa Poor standing: in, or in danger of entering, default
Ca Highly speculative
C Lowest rated
Additionally: bonds in categories Aaa to Baa are called investment grade and the
rest are non-investment grade or junk. Both corporate and government debt is rated
using this classification.
4.3.3 Bond ratings and probability of default
Year Aaa Aa A Baa Ba B All
1989 0.00 0.61 0.00 0.60 2.98 9.21 2.42
1990 0.00 0.00 0.00 0.00 3.34 16.16 3.52
1991 0.00 0.00 0.00 0.28 5.29 14.71 3.29
1992 0.00 0.00 0.00 0.00 0.30 9.03 1.33
1993 0.00 0.00 0.00 0.00 0.55 5.79 0.96
1994 0.00 0.00 0.00 0.00 0.24 3.82 0.57
1995 0.00 0.00 0.00 0.00 0.68 4.83 1.07
1996 0.00 0.00 0.00 0.00 0.00 1.45 0.54
1997 0.00 0.00 0.00 0.00 0.19 2.12 0.68
1998 0.00 0.00 0.00 0.12 0.61 4.24 1.27
1999 0.00 0.00 0.00 0.11 1.12 5.69 2.19
2000 0.00 0.00 0.00 0.39 0.91 5.42 2.34
2001 0.00 0.00 0.00 0.30 1.19 9.35 3.77
Data from Moody's; all default probabilities are expressed in percentage points.
4.3.4 Default risk, bond prices and yields
Consider a scenario where one can choose between two bonds. These bonds have
identical face values, maturities and promise identical coupons. Bond A, however,
has a higher probability of default than bond B. Which of the two bonds would you
prefer?
Most people would agree that bond B is preferable. It offers exactly the same
cash flows as A and the probability of the holder receiving those cash flows is higher
than for bond A. By this logic, we would expect people to pay more for bond B than for
bond A. Thus, there is a negative relationship between default risk and the value/price
of the bond. Higher default risk should result in a lower price.
Yields are defined for corporate bonds in exactly the same way as for government
bonds. Thus, as prices for risky corporate bonds will tend to be below those of similar
default-free government bonds, the yields on corporate bonds tend to be larger than
those on similar government bonds. The difference between the yield on corporate and
government debt is called a default spread or sometimes a credit spread.
In general, pricing corporate bonds uses the same PV techniques that we have used
in pricing government debt. However, in the corporate bond case the cash flows are
risky and must be estimated. Moreover, due to the risky cash flows, we need to use
a risk-adjusted discount rate rather than the default-free spot rate. We will discuss
such risk-adjusted discount rates when we move on to equities in the next few chapters.
4.4 Conclusion
This chapter has provided a brief introduction to bond markets and bond pricing. We
have made extensive use of the PV techniques introduced in Lecture 1 to determine
how bonds should be valued. We have also explored the relationship between different
bond prices. This foundation will be built upon in later courses that explore bond
portfolio management and more advanced bond pricing topics.
CHAPTER 5
Introduction to Equities and Risk
In this chapter we first present some basic facts related to equity securities and equity
markets before going on to discuss choice under uncertainty.
5.1 Equity markets: basic facts and features
Definition, equity security: an equity security is an ownership stake
in a firm which generally entitles its owner to a single vote on corporate
matters (e.g. in the election of the board of directors) and a share in the
distribution of dividends.
The key features of equities are the following;
• Limited liability: in the event of bankruptcy, equityholders lose at most their
  initial investment.
• Residual claimants: equityholders have the lowest priority claim on the firm's
  assets.
5.1.1 Variations on the simple equity security
Non-voting shares: some equity shares do not give the holder any voting rights.
As such, they are generally less valuable than voting shares in the same firm.
Preferred stock: preferred stock generally does not give the holder any voting rights.
However, it tends to promise fixed dividends every period, unlike ordinary shares. In
any period where a firm cannot pay the promised dividend to preferred stockholders,
the missed dividends accumulate and are paid in full as soon as possible. Thus,
given the fixed nature of its dividends and lack of voting rights, preferred stock has
bond-like features.
Convertible Preferred stock: preferred stock that can be exchanged for ordinary
stock under certain conditions.
5.1.2 Stock Market listings
The managers of a corporation may choose to have their stock listed on a recognised
stock exchange. In order to list a stock, the exchange typically requires that the firm
be relatively large and have significant trading interest amongst members of the public.
Once listed, stocks are eligible for trade on the exchange in question.
Some global trading venues, along with their sizes, are given below;
• The London Stock Exchange: total market cap £1.84tn.
• Euronext: €2.3tn.
• NYSE, ASE, Nasdaq: $13.3tn., $0.5tn., $3.9tn.
• Tokyo Stock Exchange: ¥522tn.
5.1.3 Equity indices
Individuals most commonly see equity market information through the movements
of various exchange-level or country-level equity market indices. These indices are
weighted averages of the prices of a given set of stocks and thus give a guide as to
movements in that set as a whole.
Examples of indices are;
• US: Dow Jones Industrial Average, S&P-500, Nasdaq-100.
• Europe: FTSE-100, FTSE-250, SMI, CAC-40, DAX-30, MIB.
• Far East: Nikkei-225, Topix, Hang Seng, Kospi, Straits Times Index.
5.1.4 Equity Market Movements: 01/01/1995 - present
[Figure: time series of the FTSE-100, S&P-500 and Nikkei-225 equity indices.]
5.1.5 Summary: key features of equities
From our graphs and our understanding of the basic structure of equity securities we
can say the following;
• Risk: equities are clearly much more risky than bonds;
  - Their cash flows are dividends, which are discretionary and variable (unlike
    most bond interest payments).
  - Stocks are the lowest priority claim on firm assets.
  - Statistically, the variability of stock price indices is much larger than that
    of bond price indices.
Thus, to understand stock valuation, we will need to begin to explore how investors
respond to risk.
5.2 Choice, uncertainty and risk
We now progress to give a theoretical foundation for the manner in which individuals
respond to risk. We look at the following areas;
• Foundations of choice under certainty.
• Foundations of choice under uncertainty.
• Characterizing and measuring preferences towards risk.
• Stochastic dominance.
5.3 Choice under certainty
Here we present the standard foundations of choice theory found in economics in
order to provide a basis for understanding how individuals will react when faced with
a menu of choices, each of which has uncertain future outcomes.
To begin, let's briefly discuss choice under certainty. We presume that agents in our
world are rational to the extent that they possess preferences that have the following
properties.
5.3.1 Axioms of choice under certainty
Consider an individual faced with the choice between two alternatives a and b. Denote
the situation where a is strictly preferred to b with \(a \succ b\), a is weakly preferred to
b with \(a \succeq b\) and indifference between a and b with \(a \sim b\). Then we require the
following to hold;
1. Completeness: it must be true that either \(a \succeq b\) or \(b \succeq a\) or both are true
(such that \(a \sim b\)). All we're saying here is that any two alternatives can be
ranked or judged as identical.
2. Transitivity: if for any three alternatives we have \(a \succeq b\) and \(b \succeq c\) then it
must be true that \(a \succeq c\).
The preceding two conditions seem pretty sensible. We should be able to rank any
two items on a menu in terms of how much we like them. Transitivity rules out cases
where we might get endless cycles in our decision making.
5.3.2 Preferences and utility functions
These conditions, along with a couple of smaller more technical ones, deliver
something very powerful.
Existence of a utility function: if we have a preference ordering that
satisfies the preceding conditions then there exists a time-invariant, continuous
real-valued utility function, u, such that for any two items;
\[ a \succeq b \iff u(a) \geq u(b) \]
The preference relationship can be viewed as a machine which, when you feed it with
2 alternatives, tells you which is better (or sometimes tells you that they're equally
good).
The utility function maps the choices you put in to points on the real line. To then
compare two choices, you take the 2 numbers that the utility function spits out and
compare them: the best choice is the one that generates the higher number.
This is all very well when one's comparing the benefit of eating an apple versus that
from eating an orange but it ignores the salient feature of financial assets: the future
value/payoff of a financial asset is not certain when you are making your investment
choice. Thankfully, we can generalise our prior analysis to cover uncertain outcomes.
5.4 Choice under uncertainty
Consider an investor who has £100 to invest today in either;
• United Utilities, a relatively stable British utility firm
• Partygaming, a UK-listed internet gaming firm
He intends to cash in his investment in one month's time and use the proceeds to
buy food and shelter.
Thus, what's important to our individual is the cash his investment will generate one
month down the line, but this cash amount is uncertain: both shares could go up
or down in value in the month ahead.
How does our investor decide which firm to invest in given the uncertainty over their
payoffs?
We will generalise our analysis of the choice under certainty case to cover lotteries,
where a lottery is defined as a set of possible future payoffs each of which is associated
with a specific probability of occurrence.
We can view both of the stock investments of the preceding paragraph as lotteries;
• Paying the £100 is essentially buying a lottery ticket that gives us the
  opportunity to win one of a set of prizes.
• Those prizes are the Sterling value of the stock in a month's time and each of
  the prizes has a probability of occurring.
Note: this requires us to be able to describe all possible outcomes for the future stock
price and associate each with a probability. That set of probabilities is assumed to
be objective in the sense that everyone agrees on the probability values.
For simplicity we'll focus on lotteries with 2 possible outcomes (in cash terms) x and
y and where the probabilities of x and y are \(\pi\) and \(1-\pi\) respectively. We'll use this
notation for that lottery, \((x, y; \pi)\), where it's understood that, as the probability of x
is \(\pi\), the probability of y must be \(1-\pi\).
5.4.1 Axioms of choice under uncertainty:
1. A lottery with a probability one outcome is the same as receiving that prize for
certain: \((x, y; 1) \sim x\)
2. Lottery ordering is irrelevant; \((x, y; \pi) \sim (y, x; 1-\pi)\)
3. Completeness; as before, either lottery 1 is weakly preferred to lottery 2 or the
converse or both.
4. Transitivity; if lottery 1 is preferred to lottery 2 and lottery 2 to lottery 3 then
lottery 1 is preferred to lottery 3.
5. Reflexivity; \((x, y; \pi) \succeq (x, y; \pi)\) for all lotteries.
6. If one is indifferent about 2 outcomes then one will be indifferent regarding
lotteries over those 2 outcomes; if \(x \sim y\) then \((x, z; \pi) \sim (y, z; \pi)\).
7. Preferences are continuous.
5.4.2 The Expected Utility Theorem
If the preceding axioms hold over the space of lotteries then there exists a continuous,
real-valued utility function U that ranks lotteries the same as our preference ordering
i.e.
\[ (x, y; \pi) \succeq (w, z; \pi') \iff U((x, y; \pi)) \geq U((w, z; \pi')) \]
and the function U satisfies the expected utility property;
\[ U((x, y; \pi)) = \pi U(x) + (1-\pi) U(y) \]
where \(U(\cdot)\) is the utility of money function which we require to be increasing in its
argument. Thus the utility of the lottery is the probability weighted utility of the
individual cash outcomes.
Again, our expected utility formulation is useful in that it maps each lottery into a
number on the real line. Lotteries can then be compared via these numbers.
For example, our hypothetical investor, having written down the payoffs of each of
the investments in each state of nature and given the probability of each state, could
use the expected utility function generated by his preferences to make his investment
decision.
Utility functions such as those we've derived above are often called von
Neumann-Morgenstern (VNM) utility functions in honour of the guys who first wrote down
the preceding analysis. Note that all that matters to individuals in this setting are final
cash outcomes from the lottery and associated probabilities.
Finally, note that positive affine transforms of VNM utility functions are also VNM
utility functions.
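To make the expected utility property concrete, here is a minimal sketch that ranks two lotteries. The payoffs, the 50/50 probabilities and the square-root utility of money function are all hypothetical choices made for illustration; they are not taken from the notes.

```python
import math


def expected_utility(lottery, utility):
    """Expected utility of a two-outcome lottery (x, y; pi)."""
    x, y, prob_x = lottery
    return prob_x * utility(x) + (1.0 - prob_x) * utility(y)


def u(wealth):
    """An increasing utility of money function (assumed for illustration)."""
    return math.sqrt(wealth)


def v(wealth):
    """A positive affine transform of u: V = a + b*U with b > 0."""
    return 3.0 + 2.0 * u(wealth)


# Hypothetical one-month payoffs on 100 invested, each with probability 0.5.
stable_stock = (104.0, 98.0, 0.5)
risky_stock = (150.0, 60.0, 0.5)

for name, utility in (("U", u), ("V", v)):
    eu_stable = expected_utility(stable_stock, utility)
    eu_risky = expected_utility(risky_stock, utility)
    choice = "stable" if eu_stable > eu_risky else "risky"
    print(name, round(eu_stable, 4), round(eu_risky, 4), "->", choice)
```

Both U and its affine transform V pick the same lottery, which illustrates why affine transforms of a VNM utility function represent the same preferences. Note also that the chosen lottery here has the lower expected payoff, a first hint of the risk-aversion discussed next.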
5.5 Preferences towards risk
Given the structure of a VNM expected utility formulation, we can now begin to
evaluate how individuals respond to risk. We will characterise individuals' preferences
towards risk by placing restrictions on the utility of money function (U).
First we require U to be increasing; this guarantees that individuals always prefer
more cash to less;
\[ U'(W) > 0 \quad \forall\, W \]
Now we place restrictions on other features of \(U(\cdot)\) to classify preferences towards
risk.
5.5.1 Risk-aversion
We define a risk-averse individual to be someone who prefers a certain level of wealth,
say W, to a lottery which would on average yield exactly W.
To formalise, assume we're investigating an individual's preferences regarding 2
possible lotteries.
• The individual gets W with probability one
• The individual receives \(W+h\) with probability 0.5 and \(W-h\) with probability
  0.5
Note that the expected value of the second lottery is exactly equal to the certain
payoff from the first i.e. if we define \(\tilde{W}\) to be the random payoff from the second
lottery we have;
\[ E(\tilde{W}) = 0.5\,(W+h) + 0.5\,(W-h) = W \]
A risk-averter prefers lottery 1 over lottery 2 for all values of h i.e.
\[ U(W) > 0.5\,U(W+h) + 0.5\,U(W-h) \]
As Figure 5.1 demonstrates, this can only hold if the risk-averse agent's utility
function is globally concave. Global concavity requires the following;
\[ U''(W) < 0 \quad \forall\, W \]
5.5.2 Risk-lovers
A second class of agent we might imagine are risk-loving in that they prefer a lottery
with expected payoff W to a certain payoff of W.
Clearly, the mirror image of the preceding arguments then tells us that these agents
must have utility functions that are globally convex (i.e. \(U''(W) > 0\)).
5.5.3 Risk-neutrality
Finally we can define risk-neutral individuals as those who are indifferent between a
lottery with expected payoff W and a certain payoff of W. In our two lottery case
this implies that;
\[ U(W) = 0.5\,U(W+h) + 0.5\,U(W-h) \]
for all h. This can only be true if the utility of money function, \(U(\cdot)\), is linear such
that \(U''(W) = 0\) for all values of W.
Figure 5.1: Risk-averse and risk-neutral utility functions
[Figure: utility (vertical axis) plotted for a risk-averse and a risk-neutral agent.]
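The three attitudes to risk can be illustrated by comparing U(W) with the expected utility of the 50/50 gamble for a concave, a linear and a convex utility function. The specific functions, the wealth level W = 100 and the gamble size h = 20 are hypothetical; note that the sketch tests a single (W, h) pair, whereas the definitions in the text require the inequality to hold for all of them.

```python
import math


def classify_risk_attitude(utility, wealth, h):
    """Compare U(W) with 0.5*U(W+h) + 0.5*U(W-h) at one (W, h) pair."""
    certain = utility(wealth)
    gamble = 0.5 * utility(wealth + h) + 0.5 * utility(wealth - h)
    if math.isclose(certain, gamble, rel_tol=1e-12):
        return "risk-neutral"
    return "risk-averse" if certain > gamble else "risk-loving"


utilities = {
    "concave (log)": math.log,
    "linear": lambda w: 2.0 + 3.0 * w,
    "convex (square)": lambda w: w ** 2,
}

for name, utility in utilities.items():
    print(name, "->", classify_risk_attitude(utility, wealth=100.0, h=20.0))
```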
5.5.4 Real people
Now, in line with common sense and the results of many survey studies in
experimental psychology, in finance theory we generally assume that individuals are
risk-averse.
Most of us, for example, would choose a certain wealth level of £1,000,000 over a
lottery, governed by the toss of a fair coin, in which one received nothing if heads
and £2,000,000 if tails.
For us, the key implication of risk-aversion is as follows;
• A risk-averter only prefers a lottery over a certain amount of cash W if the lottery
  has expected payoff larger than W.
• To induce the risk-averter to take the risky bet, you have to compensate him
  in terms of expected payoff.
5.6 Measuring risk-aversion
Question: How should we measure an individual's degree of risk-aversion?
Naive answer: risk-aversion is generated by concavity of utility functions and the
degree of concavity can be measured by \(U''(W)\). Use this to measure risk-aversion.
However: we have also already seen that the preferences of an individual are
invariant to affine transforms of utility functions. Thus a new utility function,
\(V(W) = a + bU(W)\), would be a valid representation of identical preferences to U(W)
itself. However, we can also see that \(V''(W) = bU''(W)\): the second derivatives
of the two utility functions are not identical even though the underlying preferences
are identical. Thus, the second derivative alone cannot be a decent measure of
risk-aversion.
5.6.1 The Coefficient of Absolute Risk Aversion
However, two popular measures of risk-aversion are based on transforms of the second
derivative of the utility function. The coefficient of absolute risk aversion (\(h_A\)) is
defined as follows;
\[ h_A = -\frac{U''(W)}{U'(W)} \tag{5.1} \]
Note that this measure is invariant under affine transforms of the utility function
but, in line with our naive intuition, it's still increasing in the magnitude of the second
derivative.
Now, if we wished to compare two utility functions to see which generates more
risk-aversion, we can compare their coefficients of absolute risk-aversion.
It should be noted that in general \(h_A\) will depend on W, the level of wealth.
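The invariance claim can be verified in one line. Taking the transformed utility function V(W) = a + bU(W) from the previous section, and assuming b > 0 so that V is still increasing, the constant a vanishes on differentiation and the factor b cancels in the ratio of derivatives on which the coefficient is built:

```latex
V'(W) = b\,U'(W), \qquad V''(W) = b\,U''(W)
\quad\Longrightarrow\quad
\frac{V''(W)}{V'(W)} = \frac{b\,U''(W)}{b\,U'(W)} = \frac{U''(W)}{U'(W)}
```

So U and V deliver the same coefficient of absolute risk aversion at every wealth level, even though their second derivatives differ by the factor b.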
5.6.2 The Coefficient of Relative Risk Aversion
A second widely used risk-aversion measure is the coefficient of relative risk-aversion
(\(h_R\)) defined as below;
\[ h_R = -W \frac{U''(W)}{U'(W)} \tag{5.2} \]
Again, this measure is invariant to affine utility transforms and increases with the
magnitude of the second derivative of utility. Again, we can compare degrees of
risk-aversion by comparing values of the coefficient of relative risk-aversion.
5.6.3 Absolute versus relative risk aversion
Absolute risk-aversion: an investor is offered a choice between certain wealth of
$W$ and a lottery in which he receives $W + h$ with probability $\pi$ and $W - h$ with
probability $1 - \pi$. The value of $\pi$ that makes an investor indifferent between the two
alternatives is linear in the coefficient of absolute risk aversion.
Relative risk-aversion: consider the choice between certain wealth of $W$ and a
lottery in which he receives $W(1 + h)$ with probability $\pi$ and $W(1 - h)$ with probability
$1 - \pi$. The value of $\pi$ that makes an investor indifferent between the two alternatives
is linear in the coefficient of relative risk aversion.
Thus relative risk aversion is useful in situations where one faces a proportional gain
or loss while absolute risk aversion is most useful where gains or losses are absolute.
5.7 A selection of widely-used utility functions
Here are a few of the utility functions that you'll find spread throughout the finance
literature.
1. Linear utility: as previously discussed, with positive a and b, generates
risk-neutral preferences;
$$U(W) = a + bW \tag{5.3}$$
2. Negative exponential utility: displays constant absolute risk aversion for
all levels of W. It is defined as follows;
$$U(W) = a - b e^{-cW}, \qquad h_A = c \tag{5.4}$$
where a, b and c are all positive.
3. Power utility: displays constant relative risk aversion for all wealth levels. Its
functional form is;
$$U(W) = \frac{W^{1-\gamma}}{1-\gamma}, \qquad h_R = \gamma \tag{5.5}$$
where $\gamma > 0$. Note that as $\gamma$ tends to unity the power utility function tends to
the log utility function i.e.
$$U(W) = \ln(W), \qquad h_R = 1 \tag{5.6}$$
4. Quadratic utility: utility has a linear wealth term and a squared wealth term
with negative coefficient (to guarantee concavity) i.e.
$$U(W) = W - bW^2 \tag{5.7}$$
where b > 0. For this function there are no clean forms for the coefficients of
absolute and relative risk aversion.
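The risk-aversion properties claimed for these utility functions can be checked numerically. The sketch below is illustrative only: the helper `risk_aversion`, the finite-difference step and every parameter value are assumptions chosen for the example, not part of the notes.

```python
import math

def risk_aversion(u, w, h=1e-4):
    """Approximate (absolute, relative) risk aversion of utility u at wealth w
    using central finite differences for U'(W) and U''(W)."""
    u1 = (u(w + h) - u(w - h)) / (2 * h)            # U'(W)
    u2 = (u(w + h) - 2 * u(w) + u(w - h)) / h ** 2  # U''(W)
    h_abs = -u2 / u1
    return h_abs, w * h_abs

# Illustrative parameter values (assumptions).
a, b, c, gamma = 1.0, 2.0, 0.5, 3.0

def neg_exp(w):
    return a - b * math.exp(-c * w)

def power(w):
    return w ** (1 - gamma) / (1 - gamma)

def quadratic(w):
    return w - 0.1 * w ** 2       # b = 0.1, so utility is increasing for W < 5

def affine_of_power(w):
    return 3.0 + 5.0 * power(w)   # an affine transform: same preferences

for wealth in (1.0, 2.0, 4.0):
    print(f"W = {wealth}")
    print("  negative exponential (abs, rel):", risk_aversion(neg_exp, wealth))
    print("  power                (abs, rel):", risk_aversion(power, wealth))
    print("  quadratic            (abs, rel):", risk_aversion(quadratic, wealth))
```

Under these assumptions the absolute coefficient of the negative exponential stays at c, the relative coefficient of power utility stays at gamma, and both coefficients of the quadratic change with wealth.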
5.8 Stochastic dominance
Finally for this chapter we'll briefly introduce two further mechanisms for discriminating
between lotteries. These concepts rely only on limited information regarding
preferences. To do this we're going to need a little more notation.
Consider two lotteries, A and B, each of which has outcomes defined on the entire real
line. The cumulative distribution functions (CDFs) at value $x$ of these two lotteries
are denoted $F_A(x)$ and $F_B(x)$. What do the cumulative distribution functions tell
you? Well, by definition we have;
$$F_A(x_0) = \Pr(x \le x_0)$$
i.e. the CDF tells one the total probability that the outcome of the lottery is an
amount at or below $x_0$. Using this notation alone we can define our first concept.
5.8.1 First-order stochastic dominance
Lottery A is said to first-order stochastic dominate lottery B if the following condition
holds;
$$F_A(x) \le F_B(x) \quad \forall x$$
Thus, for every payoff level, lottery A has a smaller probability of yielding an amount
at or below this level. Flipping this round, for every payoff level lottery A has a greater
probability of returning an amount above this specified payoff level. Graphically, a
case such as this is shown in Figure 5.2.
Furthermore, it can be shown that if lottery A first-order stochastic dominates lottery
B then the expected utility of A exceeds that of B for any non-decreasing utility
function (i.e. for all U where $U'(W) > 0$). The problem with first-order dominance
is that one's rarely in a situation this clear cut.
Figure 5.2: First-order stochastic dominance [plot of the CDFs of Lottery A and Lottery B]
5.8.2 Second-order stochastic dominance
Lottery A is said to second-order stochastic dominate lottery B if the following
condition holds for every $x$;
$$\int_{-\infty}^{x} \left[F_B(z) - F_A(z)\right] dz \ge 0$$
A scenario where A dominates B is shown in Figure 5.3.
We can show that if lottery A second order stochastic dominates lottery B then A is
preferred to B by all expected utility maximisers with increasing and concave utility
functions.
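Both dominance conditions are easy to check on a grid for simple discrete lotteries. The following is a minimal sketch under stated assumptions: the three two-outcome lotteries, the grid and the helper names are all hypothetical, and the integral in the second-order condition is approximated by a running sum.

```python
import numpy as np

def cdf_on_grid(outcomes, probs, grid):
    """CDF of a discrete lottery evaluated at every grid point."""
    outcomes = np.asarray(outcomes, dtype=float)
    probs = np.asarray(probs, dtype=float)
    return np.array([probs[outcomes <= x].sum() for x in grid])

def first_order_dominates(F_a, F_b, tol=1e-12):
    """A dominates B if F_A(x) <= F_B(x) at every grid point."""
    return bool(np.all(F_a <= F_b + tol))

def second_order_dominates(F_a, F_b, dx, tol=1e-12):
    """A dominates B if the running integral of F_B - F_A is never negative."""
    running_integral = np.cumsum(F_b - F_a) * dx
    return bool(np.all(running_integral >= -tol))

grid = np.arange(-300, 301) / 100.0   # payoffs from -3 to 3 in steps of 0.01
dx = 0.01

# Hypothetical lotteries, each paying its two outcomes with probability 1/2.
F_A = cdf_on_grid([-1.0, 1.0], [0.5, 0.5], grid)  # mean 0
F_B = cdf_on_grid([-2.0, 2.0], [0.5, 0.5], grid)  # mean 0, more dispersed
F_C = cdf_on_grid([0.0, 2.0], [0.5, 0.5], grid)   # lottery A shifted up by 1

print("C first-order dominates A: ", first_order_dominates(F_C, F_A))
print("A first-order dominates B: ", first_order_dominates(F_A, F_B))
print("A second-order dominates B:", second_order_dominates(F_A, F_B, dx))
print("B second-order dominates A:", second_order_dominates(F_B, F_A, dx))
```

Lotteries A and B have the same mean, so neither can first-order dominate the other, yet the less dispersed one passes the second-order test.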
5.8.3 Mean-preserving spread
Second order stochastic dominance is closely related to the notion of a mean-preserving
spread (MPS). An MPS of a lottery is a change in a lottery's payoffs such that the
mean is unchanged but the variance is increased. Such a situation is shown in Figure
5.4: lottery B is an MPS of lottery A.
Figure 5.3: Second-order stochastic dominance [plot of the CDFs of Lottery A and Lottery B]
Intuitively, any risk-averter will prefer A to B as the lotteries have the same mean
payoffs but B has more variability in payoff (i.e. it's more risky). It's straightforward
to show that the following is true;
Result: Consider 2 lotteries with payoffs defined on the entire real line and identical
mean payoffs. Then the following 2 statements are equivalent;
Lottery A second order stochastic dominates lottery B
Lottery B is a mean-preserving spread of lottery A
Figure 5.4: Mean-preserving spreads [plot of the densities of Lottery A and Lottery B]
CHAPTER 6
Portfolio Theory
We now use the technology developed in our presentation of choice under uncertainty
to describe how investors allocate their funds to risky investments. Our presentation
is based on the seminal contribution of Markowitz (1952).
We look at a static problem in which an investor must choose how to allocate
his wealth between N risky assets.
The investor has information regarding the probability distributions of security
returns.
The investor is assumed to be risk-averse, with utility increasing in expected
portfolio return and decreasing in portfolio return variance.
However, before diving into this analysis, we present some basic information on the
properties of stock index and single stock returns.
6.1 Statistical facts regarding stock returns
First, let us make clear how we compute returns on securities. At time 0 our investor
has wealth X which he invests in a stock.
The stock price at 0 is $P_0$ and thus he buys $x = X/P_0$ units of stock.
At time 1, the stock pays a dividend of $D_1$ and the ex-dividend price is $P_1$.
The return that the investor makes on his investment is;
$$r = \frac{1}{X}\left[x(P_1 + D_1) - X\right] = \frac{D_1 + (P_1 - P_0)}{P_0}$$
Thus, returns contain a component due to dividend payments and a component due
to capital gains.
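As a small worked illustration of the return formula, the sketch below splits a return into its dividend and capital-gain components. The function name and the prices, dividend and wealth figures are hypothetical values chosen only for the example.

```python
def holding_period_return(p0, p1, d1):
    """Return (total return, dividend component, capital-gain component)."""
    dividend_component = d1 / p0
    capital_gain_component = (p1 - p0) / p0
    total = dividend_component + capital_gain_component
    return total, dividend_component, capital_gain_component

# Hypothetical inputs: wealth X, prices P0 and P1, dividend D1.
X, p0, p1, d1 = 1000.0, 50.0, 54.0, 1.0

units = X / p0                              # x = X / P0 units of stock
r_direct = (units * (p1 + d1) - X) / X      # (1/X) [x (P1 + D1) - X]
r_total, r_div, r_gain = holding_period_return(p0, p1, d1)

print(f"direct: {r_direct:.4f}, dividend: {r_div:.4f}, capital gain: {r_gain:.4f}")
```

Both routes give the same total return, as the two sides of the equation require.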
6.1.1 Stocks versus bonds
If we take annual data on US stock and bond returns from 1926-2003, we can compute
the mean return and return standard deviation for each. We get;
Security                 Mean return   Return s.d.
Small stocks             12.7%         33.3%
Large stocks             10.4%         20.4%
Long-term govt. bonds    5.4%          9.4%
T-bills                  3.7%          3.1%
Inflation                3.0%          4.3%
Key result: higher risk securities tend to have higher mean returns (where risk is
measured by return standard deviation).
6.1.2 Single stocks versus stock indices
Using more recent data, from 1968 to 1998, mean returns and return standard devi-
ations for some US stocks and the S&P-500 are as follows;
Security     Mean return   Return s.d.
Boeing       13.6%         43.6%
Coca-Cola    14.0%         26.7%
Disney       18.4%         40.4%
GM           5.6%          28.4%
S&P-500      10.0%         15.7%
Key result: broad-based portfolios tend to have much smaller return standard
deviation than single stocks, but expected returns on the broad portfolios are similar
to those on single stocks. This is the benefit of diversification.
6.1.3 Stock return correlations
Below we present the correlations for the single stock return data from the previous
subsection;
             Boeing   Coca-Cola   Disney   GM
Boeing       1
Coca-Cola    0.30     1
Disney       0.11     0.64        1
GM           0.32     0.36        0.39     1
Key results:
Stock returns tend to be positively correlated.
Correlations can be fairly large.
6.1.4 Summary
Thus, we have the following set of results from our simple statistical analysis;
Higher risk stocks tend to have higher mean returns.
Single stocks are much more risky than stock indices, although their mean
returns are not too different (diversification).
Stock return correlations tend to be positive and can be large.
Now, keep these results in mind as we go on to present the standard mean-variance
model for portfolio selection.
6.2 The Basic Mean-Variance Problem
We assume an investor who has access to N risky assets (think of them as stocks)
where N is at least 2. Our notation is as follows;
$\mu$: the $N \times 1$ vector of expected security returns
$\Sigma$: the $N \times N$ covariance matrix for the security returns
$W_0$: the investor's initial wealth in Dollars
$w$: the $N \times 1$ vector of portfolio weights the investor chooses (which must sum to 1)
We assume that all elements of $\mu$ and $\Sigma$ are finite and that $\Sigma$ is non-singular.
The investor can short sell assets to an unlimited degree.
Our task: to characterize the investor's optimal choice of $w$, given the values of $\mu$,
$\Sigma$ and $W_0$.
6.2.1 What are portfolio weights?
Denote by $X_i$ the Dollar investment made in stock i. Then, the portfolio weight for
stock i is just the fraction of wealth invested in asset i;
$$w_i = \frac{X_i}{W_0}$$
If the investor allocates all of his wealth to the N stocks, then we must have;
$$W_0 = X_1 + X_2 + \dots + X_N$$
and this of course implies that;
$$1 = w_1 + w_2 + \dots + w_N$$
6.2.2 Portfolio Characteristics
For the time-being, let's take the portfolio weights vector, $w$, as given. What are the
expected return and return variance of the investor's portfolio?
$$\mu_p = w'\mu \tag{6.1}$$
$$\sigma_p^2 = w'\Sigma w \tag{6.2}$$
Rewriting these formulae in terms of the individual elements of $\mu$ and $\Sigma$ gives;
$$\mu_p = \sum_{j=1}^{N} w_j \mu_j, \qquad \sigma_p^2 = \sum_{i=1}^{N} \sum_{j=1}^{N} w_i w_j \sigma_{i,j}$$
Thus;
The expected portfolio return is just the weighted average of the individual
mean returns, using the portfolio weights to form the weighted average.
Similarly the portfolio return variance is a weighted sum of the variances and
covariances of the individual asset returns.
As we showed previously, a restriction on the investor's problem is that the portfolio
weights must sum to unity and thus, if we denote an N-vector of ones by $\mathbf{1}$, we have;
$$w'\mathbf{1} = 1 \iff \sum_{j=1}^{N} w_j = 1$$
Note again that, as we allow short selling, each individual $w_j$ can take any positive
or negative value, as long as when you sum them they total 1.
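The portfolio mean and variance formulae can be evaluated directly with matrix products. The sketch below uses the Boeing, Coca-Cola and Disney moments from the tables earlier in the chapter; the equal weights are a hypothetical choice made purely for illustration.

```python
import numpy as np

# Sample moments for Boeing, Coca-Cola and Disney (from the earlier tables).
mu = np.array([0.136, 0.140, 0.184])          # expected returns
sd = np.array([0.436, 0.267, 0.404])          # return standard deviations
corr = np.array([[1.00, 0.30, 0.11],
                 [0.30, 1.00, 0.64],
                 [0.11, 0.64, 1.00]])
Sigma = corr * np.outer(sd, sd)               # covariance matrix

w = np.array([1 / 3, 1 / 3, 1 / 3])           # hypothetical weights, sum to 1

mu_p = w @ mu                                 # expected portfolio return
var_p = w @ Sigma @ w                         # portfolio return variance

# The same variance written out as the double sum over i and j.
var_p_sum = sum(w[i] * w[j] * Sigma[i, j] for i in range(3) for j in range(3))

print(f"expected return: {mu_p:.4f}")
print(f"return s.d.:     {np.sqrt(var_p):.4f}")
```

The resulting standard deviation sits well below the simple average of the three individual standard deviations, which is the diversification effect noted earlier.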
6.2.3 Preferences
We assume that our investor has preferences which lead to an expected utility function
of the following form;
$$U = \mu_p - \sigma_p^2 \tag{6.3}$$
Thus expected utility increases as;
The expected portfolio return increases
Portfolio return variance falls
Justification: we can derive an expected utility function that looks like this if we
assume either that the utility of money function is quadratic or security returns are
distributed normally.
6.3 Frontier portfolios
Question: for a given level of expected portfolio return, what is the minimum return
variance one can achieve in one's allocation?
Clearly any investor who dislikes return variance should choose a portfolio with the
smallest possible variance for a given expected return and so we can exclude from
consideration any portfolios that aren't in the set we derive.
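The problem we wish to solve is;
$$\min_{w} \; \frac{1}{2} w'\Sigma w \quad \text{subject to} \quad w'\mu = \mu_p, \quad w'\mathbf{1} = 1$$
where $\mu_p$ is the level of expected return we've chosen to focus on. This is a simple
constrained optimization problem which we solve using the Lagrangian method.
The Lagrangian function is as follows;
$$L = \frac{1}{2} w'\Sigma w + \lambda\left(\mu_p - w'\mu\right) + \delta\left(1 - w'\mathbf{1}\right)$$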
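Note that we've added a factor of 0.5 to the variance term (this leaves the minimiser
unchanged and tidies up the derivatives). The first-order conditions of this problem are;
$$\frac{dL}{dw} = \Sigma w - \lambda\mu - \delta\mathbf{1} = \mathbf{0} \tag{6.4}$$
$$\frac{dL}{d\lambda} = \mu_p - w'\mu = 0 \tag{6.5}$$
$$\frac{dL}{d\delta} = 1 - w'\mathbf{1} = 0 \tag{6.6}$$
where $\mathbf{0}$ denotes a column vector of zeroes.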
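Rearranging (6.4) in terms of the portfolio weight vector gives;
$$w = \lambda\Sigma^{-1}\mu + \delta\Sigma^{-1}\mathbf{1} \tag{6.7}$$
We need to get rid of the Lagrange multipliers ($\lambda$ and $\delta$) from this equation. We do
this as follows. First pre-multiply (6.7) by $\mu'$ and use (6.5) to give;
$$\mu_p = \lambda\,\mu'\Sigma^{-1}\mu + \delta\,\mu'\Sigma^{-1}\mathbf{1}$$
Then pre-multiply (6.7) by $\mathbf{1}'$ and use (6.6) to give;
$$1 = \lambda\,\mathbf{1}'\Sigma^{-1}\mu + \delta\,\mathbf{1}'\Sigma^{-1}\mathbf{1}$$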
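We now have 2 equations which we can solve for the two Lagrange multipliers. We
get the following solutions;
$$\lambda = \frac{C\mu_p - A}{D}, \qquad \delta = \frac{B - A\mu_p}{D}$$
where;
$$A = \mathbf{1}'\Sigma^{-1}\mu, \qquad B = \mu'\Sigma^{-1}\mu, \qquad C = \mathbf{1}'\Sigma^{-1}\mathbf{1}, \qquad D = BC - A^2$$
Note that, as $\Sigma$ is a covariance matrix, both B and C are guaranteed to be positive
and it's easy to show that D is positive also.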
Finally, substituting for $\lambda$ and $\delta$ in (6.7) we get;
$$w = w_0 + w_1\mu_p \tag{6.8}$$
where
$$w_0 = \frac{1}{D}\left[B\,\Sigma^{-1}\mathbf{1} - A\,\Sigma^{-1}\mu\right], \qquad w_1 = \frac{1}{D}\left[C\,\Sigma^{-1}\mu - A\,\Sigma^{-1}\mathbf{1}\right]$$
This expression for the optimal portfolio weights has a particularly simple form. It
implies that;
The optimal portfolio weights are linear in the desired expected return.
If you want an expected return of zero, portfolio weights should be $w_0$.
If you want an expected return of unity, your weights should be $w_0 + w_1$.
6.3.1 Example
To give a more concrete view of what our analysis implies for an investor's choice
set, here is a numerical example with three of the US stocks we encountered earlier:
Boeing, Coca-Cola and Disney. The expected returns and return standard deviations
for these assets are given below;
$$\mu = [0.136,\ 0.140,\ 0.184]', \qquad \sigma = [0.436,\ 0.267,\ 0.404]'$$
Their return correlation matrix is set at;
$$\begin{pmatrix} 1.00 & 0.30 & 0.11 \\ 0.30 & 1.00 & 0.64 \\ 0.11 & 0.64 & 1.00 \end{pmatrix}$$
To get the covariance matrix, $\Sigma$, we just need to multiply the correlation matrix
through by the relevant combination of standard deviations. Based on these numbers
we get the following values for our parameters from above;
A = 2.20
B = 0.33
C = 15.59
D = 0.31
From these numbers, we can derive the following values for $w_0$ and $w_1$;
$$w_0 = [0.12,\ 4.05,\ -3.17]', \qquad w_1 = [0.54,\ -23.32,\ 22.78]'$$
Using our derived values of $w_0$ and $w_1$ and equation (6.8) we can derive the appropriate
weights for a minimum variance portfolio with any desired expected return. For
example, if we wished to construct a minimum variance portfolio with expected return
10% (or in decimal terms 0.1) our weights would be;
$$w = w_0 + w_1 \times 0.1 = [0.17,\ 1.72,\ -0.89]'$$
Thus, 17% of our wealth should be invested in Boeing, 172% in Coca-Cola and we
should short Disney to the tune of 89% of wealth. Using the weights and $\Sigma$ we can
calculate the return standard deviation of the portfolio to be approximately 39%.
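The calculations in this example can be reproduced in a few lines. The sketch below is one possible way to do it, assuming NumPy; the variable names and the helper `frontier_weights` are illustrative choices rather than anything prescribed by the notes.

```python
import numpy as np

# Inputs of the example: Boeing, Coca-Cola, Disney.
mu = np.array([0.136, 0.140, 0.184])
sd = np.array([0.436, 0.267, 0.404])
corr = np.array([[1.00, 0.30, 0.11],
                 [0.30, 1.00, 0.64],
                 [0.11, 0.64, 1.00]])
Sigma = corr * np.outer(sd, sd)       # covariance = correlation x s.d. products
Sigma_inv = np.linalg.inv(Sigma)
ones = np.ones(3)

# The scalars A, B, C, D defined in the derivation.
A = ones @ Sigma_inv @ mu
B = mu @ Sigma_inv @ mu
C = ones @ Sigma_inv @ ones
D = B * C - A ** 2

# The two vectors that generate every frontier portfolio.
w0 = (B * (Sigma_inv @ ones) - A * (Sigma_inv @ mu)) / D
w1 = (C * (Sigma_inv @ mu) - A * (Sigma_inv @ ones)) / D

def frontier_weights(target):
    """Minimum variance weights for a given target expected return."""
    return w0 + w1 * target

w = frontier_weights(0.10)
print("A, B, C, D:", np.round([A, B, C, D], 2))
print("weights at 10% target:", np.round(w, 2))
print("portfolio s.d.:", round(float(np.sqrt(w @ Sigma @ w)), 3))
```

By construction the weights sum to one and deliver exactly the target expected return, which is a useful sanity check on any implementation.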
6.4 The Portfolio Frontier
We now vary the target expected return for our portfolio and compute the portfolio
return standard deviation implied by the optimal weights.
In the following figure we have done this for expected returns between 2% and 30%
and plotted target mean returns on the y-axis against portfolio return standard
deviations on the x-axis.
The curve that these pairs trace out is called the minimum variance frontier or
sometimes just the portfolio frontier.
From the Figure, we can directly read off the smallest portfolio standard deviation
available for a given expected portfolio return;
For example, for a portfolio with expected return 20%, the minimum return
standard deviation is approximately 48.5%.
Note that points in the region of the plot to the right of the curve represent portfolios
that can be formed from the 3 assets we're looking at. However, these portfolios are not
minimum variance and will thus not be chosen by individuals with mean-variance
preferences.
Finally, we can also see that there is a specific expected return, and thus set of
portfolio weights, that delivers the smallest return standard deviation of all - we call
this the minimum variance portfolio. The minimum variance portfolio, labelled M
on the plot, appears to have return standard deviation at around 25% and expected
return around 14%.
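The frontier can also be traced numerically. As an alternative to the closed-form weights, the sketch below solves the first-order conditions as one linear system for each target return and then locates the smallest standard deviation on a grid. The function name `frontier_sd` and the grid spacing are assumptions made for the illustration.

```python
import numpy as np

mu = np.array([0.136, 0.140, 0.184])
sd = np.array([0.436, 0.267, 0.404])
corr = np.array([[1.00, 0.30, 0.11],
                 [0.30, 1.00, 0.64],
                 [0.11, 0.64, 1.00]])
Sigma = corr * np.outer(sd, sd)
n = len(mu)

def frontier_sd(target):
    """Smallest return s.d. attainable for a target expected return.

    Stacks the first-order conditions into a linear system in the weights
    and the two Lagrange multipliers, then solves it."""
    K = np.zeros((n + 2, n + 2))
    K[:n, :n] = Sigma
    K[:n, n] = -mu          # multiplier on the expected-return constraint
    K[:n, n + 1] = -1.0     # multiplier on the adding-up constraint
    K[n, :n] = mu           # w'mu = target
    K[n + 1, :n] = 1.0      # w'1 = 1
    rhs = np.zeros(n + 2)
    rhs[n] = target
    rhs[n + 1] = 1.0
    w = np.linalg.solve(K, rhs)[:n]
    return float(np.sqrt(w @ Sigma @ w))

targets = np.linspace(0.02, 0.30, 281)            # 2% to 30% in 0.1% steps
sds = np.array([frontier_sd(t) for t in targets])
i_min = int(np.argmin(sds))

print(f"s.d. at a 20% target: {frontier_sd(0.20):.3f}")
print(f"grid minimum: s.d. {sds[i_min]:.3f} at expected return {targets[i_min]:.3f}")
```

The grid minimum is only an approximation to the point M; the exact weights require the analytical approach.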
Self-assessment exercise: can you alter the constrained optimisation problem we
solved earlier to explicitly derive the weights for the minimum variance portfolio,
M?
Figure 6.1: The minimum variance frontier [plot of expected return (y-axis) against return standard deviation (x-axis); M marks the minimum variance portfolio]
6.4.1 Diversification
The most important point to note regarding this Figure is that two of our three
original assets lie to the right of the frontier. They are not minimum variance.
By combining our three assets we can form more attractive portfolios, in terms of
expected return and standard deviation, than the individual assets.
This is called diversification - by holding multiple assets in your portfolio (i.e. making
it more diverse) you tend to reduce the portfolio return volatility for any given level
of expected return.
We saw this result previously when studying the statistical properties of single stock
returns versus stock index returns.
Statistically, this effect is due to the fact that our asset returns are not perfectly
correlated: as we combine them we eliminate some of the portfolio's return variability.
6.5 The Efficient Set
Any investor with mean-variance preferences will hold a portfolio that lies on the
frontier. Clearly, our investor will never choose a point on the frontier that lies below
(and to the right of) the minimum variance portfolio, M. This is because our investor
can always find a portfolio on the frontier but above M which has the same return
standard deviation but higher expected return.
For example, assume our investor wants a portfolio with 50% return standard
deviation; then there are two choices available. The first has expected return around 8%
and lies below M while the second lies above M and has expected return around 20%.
Obviously the investor will choose the latter due to its higher expected return.
Result: investors will only ever hold portfolios on the frontier that lie above the
point M - this subset of the frontier is called the efficient set.
6.6 Two-fund separation
A very nice feature of the mean-variance frontier is that the entire set of frontier
portfolios can be derived from any two frontier portfolios. This is easy to see from
equation (6.8) due to its linearity in $\mu_p$.
Assume that we know the frontier portfolios with expected returns $\mu^{(1)}$ and $\mu^{(2)}$. Call the
weight vectors corresponding to these expected returns $w^{(1)}$ and $w^{(2)}$ respectively (the
bracketed superscripts label portfolios, to keep them apart from the vectors $w_0$ and $w_1$
of equation (6.8)). Now suppose we want to derive the weights for a portfolio with new
expected return $\mu^{(3)}$. Obviously, as long as $\mu^{(1)}$ is not equal to $\mu^{(2)}$, we can express
$\mu^{(3)}$ as a linear combination of $\mu^{(1)}$ and $\mu^{(2)}$ i.e.
$$\mu^{(3)} = \alpha\mu^{(1)} + (1 - \alpha)\mu^{(2)}$$
for appropriate choice of $\alpha$.
Now we need to derive the portfolio weights. Via equation (6.8), we know that;
$$w^{(3)} = w_0 + w_1\mu^{(3)}$$
Substituting our linear combination of $\mu^{(1)}$ and $\mu^{(2)}$ for $\mu^{(3)}$ in the above we get;
$$w^{(3)} = w_0 + w_1\left(\alpha\mu^{(1)} + (1 - \alpha)\mu^{(2)}\right)$$
Obviously, we can write $w_0 = \alpha w_0 + (1 - \alpha)w_0$ and so;
$$\begin{aligned}
w^{(3)} &= \alpha w_0 + (1 - \alpha)w_0 + w_1\left(\alpha\mu^{(1)} + (1 - \alpha)\mu^{(2)}\right) \\
&= \alpha\left(w_0 + w_1\mu^{(1)}\right) + (1 - \alpha)\left(w_0 + w_1\mu^{(2)}\right) \\
&= \alpha w^{(1)} + (1 - \alpha)w^{(2)}
\end{aligned}$$
Thus, to get the portfolio weights for an expected return of $\mu^{(3)}$, all we have to do is
form a linear combination of $w^{(1)}$ and $w^{(2)}$ where the coefficient in the combination is
that which allows us to derive $\mu^{(3)}$ from $\mu^{(1)}$ and $\mu^{(2)}$. Using this technique we can trace
out the entire frontier.
This result is known as two-fund separation. It tells us that all frontier portfolios can
be derived from any pair of frontier portfolios. Via this result, if an investor wishes to
invest in a specific frontier portfolio, as long as he knows two points on the frontier,
he can get to the point he cares about by combining the two portfolios (or funds)
with the appropriate coefficient.
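Two-fund separation can be confirmed numerically with the three-stock example. In the sketch below the two "funds" (frontier portfolios with expected returns of 5% and 25%) and the 12% target are arbitrary illustrative choices, and `frontier_weights` is a hypothetical helper built from the closed-form frontier solution.

```python
import numpy as np

mu = np.array([0.136, 0.140, 0.184])
sd = np.array([0.436, 0.267, 0.404])
corr = np.array([[1.00, 0.30, 0.11],
                 [0.30, 1.00, 0.64],
                 [0.11, 0.64, 1.00]])
Sigma = corr * np.outer(sd, sd)
Sigma_inv = np.linalg.inv(Sigma)
ones = np.ones(3)

A = ones @ Sigma_inv @ mu
B = mu @ Sigma_inv @ mu
C = ones @ Sigma_inv @ ones
D = B * C - A ** 2

def frontier_weights(target):
    """Frontier weights from the closed-form solution, linear in the target."""
    intercept = (B * (Sigma_inv @ ones) - A * (Sigma_inv @ mu)) / D
    slope = (C * (Sigma_inv @ mu) - A * (Sigma_inv @ ones)) / D
    return intercept + slope * target

# Two arbitrary frontier "funds" and a third target return.
ret_a, ret_b, ret_new = 0.05, 0.25, 0.12
fund_a = frontier_weights(ret_a)
fund_b = frontier_weights(ret_b)

# Coefficient that reproduces the new target from the two fund returns.
alpha = (ret_new - ret_b) / (ret_a - ret_b)
combined = alpha * fund_a + (1 - alpha) * fund_b

print("alpha:", round(alpha, 4))
print("combination of funds:", np.round(combined, 4))
print("direct frontier solve:", np.round(frontier_weights(ret_new), 4))
```

The combination and the direct solution coincide, so an investor who knows only the two funds never needs to solve the optimisation problem again.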
6.7 A Risk-less Asset
Finally, we evaluate how the introduction of a risk-free security alters our analysis.
Thus, we add such a security (think of it as a bond issued by an entirely credit-worthy
government if you wish).
Features of the risk-free asset;
It delivers a guaranteed return of $r_f$.
Its return variance is thus zero.
The covariance of its return with any other security return is also zero.
If we allow our investor to place some money in the risk-free asset then our optimization
problem is altered and can be written down as below;
$$\min_{w} \; \frac{1}{2} w'\Sigma w \quad \text{subject to} \quad w'\mu + (1 - w'\mathbf{1})r_f = \mu_p$$
This problem differs from our previous formulation. First of all, we've lost the
requirement that the sum of the elements of $w$ be one. This is because we're investing in
the risk-free asset also. Thus, in the expression for the expected portfolio return we
see an extra term on the left-hand side for our investment in the risk-free asset.
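The Lagrangian corresponding to this problem is;
$$L = \frac{1}{2} w'\Sigma w + \lambda\left[\mu_p - w'\mu - (1 - w'\mathbf{1})r_f\right]$$
Our first order conditions are now;
$$\frac{dL}{dw} = \Sigma w - \lambda\mu + \lambda\mathbf{1}r_f = \mathbf{0} \tag{6.9}$$
$$\frac{dL}{d\lambda} = \mu_p - w'\mu - (1 - w'\mathbf{1})r_f = 0 \tag{6.10}$$
Again, let's rearrange equation (6.9) in terms of portfolio weights to yield;
$$w = \lambda\Sigma^{-1}\left(\mu - \mathbf{1}r_f\right)$$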
Last of all, we need to solve for $\lambda$. We substitute the expression we've just derived
for $w$ into the left-hand side of equation (6.10). This gives an equation in $\lambda$ and the
model parameters $\mu$, $\Sigma$ and $r_f$. Solving for $\lambda$ we get;
$$\lambda = \frac{\mu_p - r_f}{B - r_f(2A - r_f C)} = \frac{\mu_p - r_f}{Z}$$
where A, B, and C are as defined in the previous problem. Note that Z is a scalar, so
that our solution for $\lambda$ is just a multiple of the excess return on the risky portfolio
over the risk-free rate.
Substituting for $\lambda$ in the expression for the optimal portfolio weight we get;
$$w = \Sigma^{-1}\left(\mu - \mathbf{1}r_f\right)\frac{\mu_p - r_f}{Z} \tag{6.11}$$
This gives the vector of optimal investments in the risky securities. The optimal
investment in the risk-free asset is given by $1 - w'\mathbf{1}$.
Figure 6.2: The minimum variance frontier with a risk-free asset [plot of expected return (y-axis) against return standard deviation (x-axis); T marks the tangency portfolio]
6.7.1 Discussion
Figure 6.2 uses our result in equation (6.11) to plot the frontier in the case where we
have a risk-free asset. For reference, we also include on the Figure the frontier we
derived in the case where a risk-free asset was not available. The original frontier is
shown with a dashed line and the new frontier is plotted with a solid line.
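To make the risk-free case concrete, the sketch below evaluates the optimal risky-asset weights for the three-stock example. The risk-free rate of 4% and the 10% target are hypothetical values chosen only for illustration, as are the variable names.

```python
import numpy as np

mu = np.array([0.136, 0.140, 0.184])
sd = np.array([0.436, 0.267, 0.404])
corr = np.array([[1.00, 0.30, 0.11],
                 [0.30, 1.00, 0.64],
                 [0.11, 0.64, 1.00]])
Sigma = corr * np.outer(sd, sd)
Sigma_inv = np.linalg.inv(Sigma)
ones = np.ones(3)

r_f = 0.04                                   # hypothetical risk-free rate
A = ones @ Sigma_inv @ mu
B = mu @ Sigma_inv @ mu
C = ones @ Sigma_inv @ ones
Z = B - r_f * (2 * A - r_f * C)
excess = mu - r_f * ones                     # expected excess returns

def risky_weights(target):
    """Risky-asset weights for a target expected return with a risk-free asset."""
    return (Sigma_inv @ excess) * (target - r_f) / Z

w = risky_weights(0.10)
riskfree_share = 1.0 - w.sum()               # remainder goes into the risk-free asset
achieved = w @ mu + riskfree_share * r_f

# Tangency portfolio: the same risky mix rescaled to sum to one.
w_T = (Sigma_inv @ excess) / (ones @ Sigma_inv @ excess)

print("risky weights:", np.round(w, 3), "risk-free share:", round(riskfree_share, 3))
print("achieved expected return:", round(float(achieved), 4))
print("tangency portfolio:", np.round(w_T, 3))
```

Whatever target is chosen, the risky holdings are a scaled copy of the same tangency mix; only the split between that mix and the risk-free asset changes.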
There are 3 key features of the frontier now.
It is a straight line.
It cuts the vertical axis at exactly the risk-free rate
It is tangent to the original frontier at point T
6.7.2 Interpretation
First an aside on the effect of combining the risk-free asset with a risky portfolio.
Consider an arbitrary portfolio of risky assets, Q, held by an investor. Now introduce
a risk-free asset. What happens if the investor takes his original portfolio and starts
to mix it with the risk-free asset?
If we denote the expected return and return standard deviation on Q with $\mu_Q$ and
$\sigma_Q$ respectively then a mixture with weight $\alpha$ on the risk-free asset has the following
properties;
$$\mu = \alpha r_f + (1 - \alpha)\mu_Q, \qquad \sigma = (1 - \alpha)\sigma_Q$$
Now, if we start with $\alpha$ at unity (i.e. we start with an investment solely in the risk-free
asset) and gradually decrease it, both the expected return and standard deviation
increase but, interestingly, in expected return - standard deviation space the curve
traced out is a straight line, starting at $r_f$ and passing through Q. We can see this
by noting that;
$$\frac{d\mu}{d\sigma} = \frac{d\mu/d\alpha}{d\sigma/d\alpha} = \frac{\mu_Q - r_f}{\sigma_Q}$$
This is the slope of the curve that represents all combinations of Q and the risk-free
asset and, as the slope is constant, we can see that the curve that is traced out must
be a straight line.
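The constant slope is easy to see numerically. In the sketch below the risk-free rate and the moments of portfolio Q are hypothetical numbers, and the slope between neighbouring mixtures is computed by simple differencing.

```python
import numpy as np

# Hypothetical inputs: risk-free rate and the moments of a risky portfolio Q.
r_f, mu_Q, sigma_Q = 0.04, 0.10, 0.20

alpha = np.linspace(1.0, 0.0, 11)          # weight on the risk-free asset
mean = alpha * r_f + (1 - alpha) * mu_Q    # expected return of each mixture
stdev = (1 - alpha) * sigma_Q              # standard deviation of each mixture

# Slope between neighbouring mixtures in (s.d., expected return) space.
slopes = np.diff(mean) / np.diff(stdev)

print("slopes:", np.round(slopes, 6))
print("(mu_Q - r_f) / sigma_Q:", (mu_Q - r_f) / sigma_Q)
```

Every segment has the same slope, so the mixtures lie on one straight line running from the risk-free rate through Q.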
What we've shown here is that the set of portfolios generated by combining any
particular risky portfolio with the risk-free security traces out a straight line in expected
return - standard deviation space.
Now an investor with mean-variance preferences, in the absence of a risk-free security,
will always choose a portfolio in the efficient set. If we introduce a risk-free security,
the efficient set is altered. The efficient set is exactly the straight line that joins the
point representing the risk-free rate with the tangency portfolio, T.
Moreover, any investor can get to any point on this line by forming the appropriate
combination of the risk-free security with the tangency portfolio, T - this is two-fund
separation again. We know from the analysis we've just performed that combining
any given risky portfolio with the risk-free asset traces out a straight line as one varies
the weight on the risk-free security.
The Efficient set in the presence of a risk-free asset: when a risk-free
asset is available, the efficient set is the straight line that passes through
the point representing the risk-free rate and is tangent to the risky portfolio
frontier. Any point on this line can be formed by appropriately combining
the risk-free security with the tangency portfolio, T.
6.8 Conclusion
In this chapter we have used some simple mathematics, combined with weak
restrictions on investors' preferences, to narrow down the set of portfolios an investor might
choose when faced with N risky assets and, later, with a single risk-less asset. In both
cases, i.e. with and without the risk-free security, we derived a representation for the
members of the efficient set of portfolios; those being the portfolios with the smallest
return variance for a specified level of expected return. Moreover, we demonstrated
the result known as two-fund separation. This result states that every portfolio in
the efficient set can be formed as a linear combination of any two distinct efficient
portfolios.
In the next chapter, we will use the foundations we have built here to derive an
equilibrium asset pricing model: the famous Capital Asset Pricing Model.
CHAPTER 7
The CAPM
We now employ the analysis developed in the previous chapter to derive and evaluate
the Capital Asset Pricing Model (CAPM), originally developed in the mid 1960s by
Lintner, Mossin and Sharpe.
The CAPM is a model based on equilibrium in securities markets. To derive the
CAPM relationship we impose the condition that the demand for securities must
be equal to their supply. Combining this condition with our mean-variance analysis
allows us to derive a mathematical equation which links the expected return on any
security to its level of risk.
Central message: in a world populated by investors who like greater expected
returns but dislike greater risk, more risky assets have larger expected returns in
equilibrium.
7.1 The CAPM derivation
We start with the expression for the optimal risky portfolio in a mean-variance world
(with a risk-free asset). We previously called this portfolio the tangency portfolio and
labelled it T;
$$w_T = \Sigma^{-1}\left(\mu - \mathbf{1}r_f\right)\frac{\mu_T - r_f}{Z} \tag{7.1}$$
Now, we use this equation to derive an expression for the covariance of the returns
on any portfolio of risky assets (Q), with the tangency portfolio.
Denoting the expected return on that portfolio by $\mu_Q$ we have;
$$\mathrm{Cov}(r_T, r_Q) = w_T'\Sigma w_Q = \frac{\mu_T - r_f}{Z}\left(\mu - r_f\mathbf{1}\right)'w_Q = \frac{\mu_T - r_f}{Z}\left(\mu_Q - r_f\right)$$
where the final equality follows from the definition of $\mu_Q$ and the fact that $\mathbf{1}'w_Q = 1$.
Similarly, the variance of the return on the tangency portfolio is;
$$\mathrm{Var}(r_T) = w_T'\Sigma w_T = \frac{\mu_T - r_f}{Z}\left(\mu - r_f\mathbf{1}\right)'w_T = \frac{(\mu_T - r_f)^2}{Z}$$
Now combine these two formulae in order to eliminate the coefficient Z;
$$\frac{\mu_T - r_f}{\mathrm{Cov}(r_T, r_Q)}\left(\mu_Q - r_f\right) = \frac{(\mu_T - r_f)^2}{\mathrm{Var}(r_T)} \tag{7.2}$$
Cancelling the common term of $(\mu_T - r_f)$ on both sides and multiplying each side by
the covariance between $r_T$ and $r_Q$ we get;
$$\left(\mu_Q - r_f\right) = \frac{\mathrm{Cov}(r_T, r_Q)}{\mathrm{Var}(r_T)}\left(\mu_T - r_f\right) \tag{7.3}$$
Interpretation: the expected excess return on any portfolio (Q) is directly proportional to
the expected excess return on the tangency portfolio. The coefficient of proportionality is given
by the ratio of the covariance of the returns on Q and the tangency portfolio to the
variance of the return on the tangency portfolio.
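This proportionality can be verified numerically for the three-stock data used in the previous chapter. The sketch below is illustrative: the 4% risk-free rate and the weights of portfolio Q are hypothetical, and any weights summing to one would serve equally well.

```python
import numpy as np

mu = np.array([0.136, 0.140, 0.184])
sd = np.array([0.436, 0.267, 0.404])
corr = np.array([[1.00, 0.30, 0.11],
                 [0.30, 1.00, 0.64],
                 [0.11, 0.64, 1.00]])
Sigma = corr * np.outer(sd, sd)
ones = np.ones(3)
r_f = 0.04                                   # hypothetical risk-free rate

# Tangency portfolio: proportional to Sigma^{-1}(mu - r_f 1), scaled to sum to one.
w_T = np.linalg.solve(Sigma, mu - r_f * ones)
w_T = w_T / w_T.sum()

w_Q = np.array([0.5, 0.2, 0.3])              # arbitrary risky portfolio Q

mu_T, mu_Q = w_T @ mu, w_Q @ mu
cov_TQ = w_T @ Sigma @ w_Q
var_T = w_T @ Sigma @ w_T

lhs = mu_Q - r_f                             # excess return on Q
rhs = cov_TQ / var_T * (mu_T - r_f)          # ratio times excess return on T

print(f"excess return on Q: {lhs:.6f}")
print(f"Cov/Var x excess return on T: {rhs:.6f}")
```

The two numbers agree to rounding error, and repeating the check with a single-stock portfolio such as (1, 0, 0) gives the same agreement.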
7.1.1 Some further assumptions
Now we make some assumptions regarding the population of investors in our world.
All investors agree on expected returns, return standard deviations and thus
the shape of the mean-variance frontier.
All investors agree on the level of the risk-free rate.
Note that we've already made several assumptions in our mean-variance analysis (e.g.
infinite divisibility of assets, unconstrained short sales, ...).
Altogether, these assumptions imply that all investors do their mean-variance
mathematics and come up with the same graph of risk-return opportunities.
Investors will then all choose (possibly different) portfolios lying on the straight line
traced out by the efficient frontier.
7.1.2 Equilibrium
We can now introduce our notion of equilibrium.
Demand: as all investors locate on the straight line connecting $r_f$ and T, all investors
hold risky assets in the proportions dictated by the tangency portfolio. Thus, the
tangency portfolio can be thought of as representing the demand side of the market.
Supply: the portfolio of all available assets is the supply side of the market.
Equilibrium: demand = supply tells us that the tangency portfolio must be identical
to the market portfolio of risky assets.
It is fairly common to think of the market portfolio as being approximated by a
wide-ranging, stock index such as the S&P-500 in the US or the FT All Share in the
UK. These are both market-cap weighted portfolios of many stocks.
7.1.3 The CAPM equation
Denoting the return on the market portfolio by $r_M$ and substituting this for $r_T$ in
equation (7.3) we get the following;
$$E(r_Q) - r_f = \beta_Q\left[E(r_M) - r_f\right] \tag{7.4}$$
where we used the fact that $\mu_i$ is just the expected return on portfolio i and we've
defined;
$$\beta_Q = \frac{\mathrm{Cov}(r_M, r_Q)}{\mathrm{Var}(r_M)}$$
This is the famous CAPM equation. It is the result of simple mean-variance analysis,
assumptions of homogeneity of information and beliefs across investors and market
equilibrium.
7.2 Understanding the CAPM equation
Interpretation of the CAPM equation is straightforward. Below I have rewritten the
equation with a couple of extra labels;
$$\underbrace{E(r_Q) - r_f}_{\text{Excess stock returns}} = \underbrace{\beta_Q}_{\text{Risk}} \, \underbrace{\left[E(r_M) - r_f\right]}_{\text{Excess market returns}} \tag{7.5}$$
The risk of any security in a CAPM world is measured by its $\beta$ and the expected
excess return on an asset is directly proportional to $\beta$. Thus, in line with intuition, risk
averse investors must receive a higher expected return from holding high risk (high
$\beta$) assets than they receive from low risk assets - in simple terms, if you don't like
risk you must be paid to bear it.
7.2.1 $\beta$ and Risk
But why does $\beta$ represent risk? Well, it is defined as follows;
$$\beta_Q = \frac{\mathrm{Cov}(r_M, r_Q)}{\mathrm{Var}(r_M)}$$
Take an investor currently holding the market portfolio. He is considering adding a
little more of asset Q to his portfolio. How should he judge Q?
If $\mathrm{Cov}(r_M, r_Q)$ is positive then adding Q to his portfolio will increase the
variability of his portfolio returns i.e. it will increase risk.
If $\mathrm{Cov}(r_M, r_Q)$ is negative then adding Q to his portfolio would remove some of
its return variability i.e. lowering risk.
Thus, the larger the $\beta$, the higher an asset's risk.
7.2.2 More stuff about $\beta$
In a CAPM world $\beta$ is the appropriate measure of a stock's risk.
The market portfolio must have a $\beta$ of 1.
Assets with $\beta$ below one are defensive assets - defensive in the sense that
they are relatively low risk. For such assets, when the market moves up (down)
by a percentage point, they tend to move up (down) by less than a percentage
point.
Assets with $\beta$ greater than one are sometimes called aggressive assets and tend
to move up (down) by more than a percentage point when the market moves
up (down) by a percentage point.
Note: the CAPM implies that investors are only rewarded for bearing risk that is
market-related.
7.2.3 Systematic risk and idiosyncratic risk
The CAPM implies that one can write the return on any asset as follows;
$$r_Q = r_f + \beta_Q\left[r_M - r_f\right] + \epsilon_Q$$
where $\epsilon_Q$ is an error term that must be mean zero and uncorrelated with $r_M$. Taking
the variance of LHS and RHS of the preceding equation yields;
$$\sigma_Q^2 = \beta_Q^2\sigma_M^2 + \mathrm{Var}(\epsilon_Q)$$
The first term on the RHS represents systematic risk - risk that is market-related,
cannot be eliminated via diversification and thus earns expected returns. The second
term on the RHS is idiosyncratic risk - this risk is unrelated to the market, can be
diversified away and does not earn expected return.
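The split of variance into systematic and idiosyncratic parts can be illustrated with simulated data. Everything in the sketch below is an assumption made for the example: the return distributions, the true beta of 1.3, the risk-free rate and the sample size.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000
r_f = 0.003                                   # hypothetical per-period risk-free rate

# Simulated market returns and idiosyncratic shocks (illustrative parameters).
r_M = rng.normal(0.008, 0.045, T)
eps = rng.normal(0.0, 0.06, T)
r_Q = r_f + 1.3 * (r_M - r_f) + eps           # asset with a true beta of 1.3

# Sample beta: covariance with the market over the market variance.
beta_hat = np.cov(r_M, r_Q)[0, 1] / np.var(r_M, ddof=1)

# Residual once the market-related part of the return is removed.
resid = (r_Q - r_Q.mean()) - beta_hat * (r_M - r_M.mean())

total = np.var(r_Q, ddof=1)
systematic = beta_hat ** 2 * np.var(r_M, ddof=1)
idiosyncratic = np.var(resid, ddof=1)

print(f"beta estimate:        {beta_hat:.3f}")
print(f"total variance:       {total:.6f}")
print(f"systematic part:      {systematic:.6f}")
print(f"idiosyncratic part:   {idiosyncratic:.6f}")
```

Because the sample residual is uncorrelated with the market by construction, the two parts add up to the total variance exactly in the sample.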
7.2.4 The Security Market Line
The risk-return tradeoff implied by the CAPM is shown in Figure 7.1. Here we
plot the Security Market Line (SML), the straight line relationship between $\beta$ and
expected returns implied by the CAPM equation.
The slope of the SML must be equal to the expected excess return on the market
portfolio and the vertical intercept of the line must be the risk-free rate.
Key result: all securities must lie on the SML.
The expected return on all assets is solely determined by $\beta$ and as $\beta$ increases the
expected return rises with a slope coefficient equal to the excess expected return on
the market portfolio.
7.2.5 Uses of the CAPM
What does the CAPM deliver that we can usefully employ in other situations/
1. Computing risk-adjusted discount rates for use in present value calculations.
2. Stock valuation: think of a stock as a claim to a stream of future dividends.
We can value the stock by computing the present value of these dividends. Use
the CAPM to get the risk-adjusted discount rate.
106
Figure 7.1: The Security Market Line
1
0
Beta
E
x
p
e
c
t
e
d

a
s
s
e
t

r
e
t
u
r
n
r
f
E(r
m
)
3. Portfolio selection;
(a) In a pure CAPM world portfolio selection is easy: hold the market portfolio
plus some of the risk-free asset.
(b) Suppose instead we're in a world where the CAPM holds for most, but not all,
stocks. If you find a stock which has expected return greater than that implied by
the CAPM, up-weight it in your portfolio relative to the market portfolio
weight.
7.3 Testing the CAPM
The CAPM equation can be used to generate a number of testable implications. We
can use these implications to evaluate whether the model appears to hold in real
world stock and bond return data. To begin, let's repeat the CAPM equation;
\[ E(r_Q) - r_f = \beta_Q [E(r_M) - r_f] \]
Some direct implications of this equation are as follows;
1. The only factor that should affect the expected excess return on a security is
its $\beta$. Taking a cross-section of stocks, the variation in their expected excess
returns is generated solely by variation in their $\beta$s.
2. If we were to estimate the relationship between a cross-section of excess stock
returns and their $\beta$s then the slope coefficient in the relationship should be
equal to the excess return on the market portfolio.
3. In the cross-sectional regression relationship between excess stock returns and
$\beta$s, the intercept coefficient should be zero.
We now need to build a statistical framework in which we can perform these tests.
Problem: the key variable on the right-hand side of the equation above, $\beta$, is not
directly observable: $\beta$s must be estimated from historical data on stock returns
and market returns.
7.3.1 Estimating $\beta$
$\beta$ is defined to be the ratio of the covariance between stock and market returns to the
variance of the market return. Note that this expression is identical to the formula for
the slope coefficient in a least squares regression of stock returns on market returns.
Denoting market returns at time t with $r_{M,t}$ and the returns on stock Q at the same
time with $r_{Q,t}$, stock Q's $\beta$ can be estimated as the slope coefficient from the following
time-series regression equation;
\[ r_{Q,t} = \alpha + \beta r_{M,t} + \epsilon_t \tag{7.6} \]
where t runs from 1 to T, $\alpha$ is an intercept coefficient and $\epsilon_t$ is a regression error.
The expression for the least squares estimate of the slope coefficient is given by;
\[ \hat{\beta} = \frac{\mathrm{Cov}(r_{M,t}, r_{Q,t})}{\mathrm{Var}(r_{M,t})} \]
which is precisely what we want.
Thus, $\beta$s for individual assets can be estimated via stock-by-stock time-series
regressions of stock returns on market returns.
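As a rough illustration of the time-series step, the sketch below estimates a stock's beta as the ratio of sample covariance to sample market variance and checks that this matches the least squares slope. The return series are simulated, and the function name `estimate_beta` and all parameter values are assumptions made for the example, not part of the original notes. It also splits the stock's sample variance into a systematic part and an idiosyncratic part, as in Section 7.2.3.

```python
import numpy as np

def estimate_beta(stock_returns, market_returns):
    """Return (alpha, beta) for a regression of stock returns on market returns."""
    r_q = np.asarray(stock_returns, dtype=float)
    r_m = np.asarray(market_returns, dtype=float)
    cov = np.cov(r_m, r_q, ddof=1)        # 2x2 sample covariance matrix
    beta = cov[0, 1] / cov[0, 0]          # Cov(r_M, r_Q) / Var(r_M)
    alpha = r_q.mean() - beta * r_m.mean()
    return alpha, beta

# Simulated monthly data (hypothetical parameters, for illustration only)
rng = np.random.default_rng(0)
r_m = rng.normal(0.01, 0.05, size=240)
r_q = 0.002 + 1.3 * r_m + rng.normal(0.0, 0.02, size=240)

alpha_hat, beta_hat = estimate_beta(r_q, r_m)

# Cross-check against an ordinary least squares fit of a straight line
ols_slope, ols_intercept = np.polyfit(r_m, r_q, 1)

# In-sample variance decomposition: systematic plus idiosyncratic risk
residuals = r_q - alpha_hat - beta_hat * r_m
systematic = beta_hat**2 * np.var(r_m, ddof=1)
idiosyncratic = np.var(residuals, ddof=1)
total = np.var(r_q, ddof=1)

print(f"beta = {beta_hat:.3f}, systematic share = {systematic / total:.2%}")
```

The decomposition is exact in sample because least squares residuals are uncorrelated with the regressor by construction.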
7.3.2 Cross-sectional tests
Assume that we have return data on N different stocks, indexed by i, as well as
market returns plus each stock's $\beta$.
We can test the implications of the CAPM using a cross-sectional regression of the
following form;
\[ \overline{r_{i,t} - r_f} = \gamma_0 + \gamma_1 \beta_i + \epsilon_i \tag{7.7} \]
where $\epsilon_i$ is a regression error term, $\overline{r_{i,t} - r_f}$ is the average excess return on stock i over
our sample period, $\gamma_0$ is a regression intercept and $\gamma_1$ is a regression slope coefficient.
This cross-sectional regression is just a statistical analogue of the Security Market
Line we plotted earlier.
The slope of the SML is $E(r_M) - r_f$ and the intercept is $r_f$. If we estimate equation
(7.7), the value of $\gamma_0$ should be close to zero and $\gamma_1$ should be approximately the mean
excess market return. Via OLS, we can test these hypotheses.
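The second-pass regression can be sketched as follows. This is a minimal example on simulated data in which the CAPM holds by construction; the betas, the 6% market premium and the noise level are all hypothetical values chosen for illustration. It regresses average excess returns on betas and reports the intercept and slope.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical inputs: 20 stocks with known betas and a 6% market risk premium
betas = np.linspace(0.5, 1.5, 20)
market_premium = 0.06

# Average excess returns generated so that the CAPM holds, plus sampling noise
mean_excess = market_premium * betas + rng.normal(0.0, 0.005, size=betas.size)

# Cross-sectional OLS: mean excess return on a constant and beta
design = np.column_stack([np.ones_like(betas), betas])
coeffs, *_ = np.linalg.lstsq(design, mean_excess, rcond=None)
intercept_hat, slope_hat = coeffs

print(f"intercept = {intercept_hat:.4f} (CAPM predicts 0)")
print(f"slope     = {slope_hat:.4f} (CAPM predicts the market premium)")
```

In real applications the betas themselves are first-pass estimates, so the errors-in-variables issue applies; the sketch sidesteps this by treating the betas as known.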
Finally, as mentioned above, the CAPM implies that only $\beta$ should help us explain
the cross-section of expected stock returns. Thus, if we extend our previous
cross-sectional equation to include any extra variable on the right-hand side, that variable
should have no impact.
More concretely, if we denote this extra variable with $X_i$ then we can write down our
extended regression as follows;
\[ \overline{r_{i,t} - r_f} = \gamma_0 + \gamma_1 \beta_i + \gamma_2 X_i + \epsilon_i \tag{7.8} \]
The CAPM tells us that the value of $\gamma_2$ should be very close to zero and this should
hold for any definition of the variable $X_i$.
7.3.3 Problems and extensions
While in principle, the previous two subsections provide us with the means to conduct
a test of the CAPM, there remain some problems of implementation.
Econometric problem 1: the $\beta$s we're using are estimates and are thus measured
with error. This results in a tendency for the estimated slope coefficient in the
cross-sectional regression (7.7) to be biased towards
zero. This must be considered when interpreting results of our cross-sectional tests.
Econometric problem 2: the error term in the time-series regression (7.6) is
unlikely to be normally distributed. A common feature of stock return data is that
their distributions tend to have excess kurtosis (fat-tails) and positive skew. Such
features will lead to incorrect statistical inference being drawn from the time-series
regressions used to estimate $\beta$s.
7.3.4 Noise in stock returns and portfolio formation
Another problem of implementation comes from the fact that individual stock returns
are so noisy. As individual stock return volatilities are frequently larger than 50%
per year, it is very difficult to statistically distinguish the mean returns on stocks
from one another. The standard way to alleviate this problem is to group stocks into
portfolios where one's portfolio selection methodology attempts to separate high and
low return stocks into different portfolios.
A common way to do this is to group stocks by $\beta$ or alternatively by market cap,
e.g. to form 10 portfolios from one's universe of stocks where each stock is assigned
to a portfolio based on comparison of its mean market cap with the deciles of the
cross-sectional mean market cap distribution.
Note also that using this portfolio approach tends to reduce the measurement error
that is encountered when estimating $\beta$s and thus alleviates our first problem.
7.3.5 The Roll Critique
Finally, we should mention the Roll critique of tests of the CAPM. This critique is
based on the fact that the true market portfolio is unobservable.
Tests of the CAPM are necessarily based on imperfect proxies for the market
portfolio rather than the true market portfolio.
If one has chosen a proxy which, ex-post, lies on the efficient frontier, then by
construction a linear relationship between stock returns and $\beta$s will arise, even though the
market might be inefficient.
If one picks an inefficient proxy, then one will not find a linear relationship
between returns and $\beta$s even though the market might be efficient.
Based on these arguments, the opinion of some is that our testing methodology tells
us little, if anything, about the validity of the CAPM.
7.3.6 Empirical evidence
Tests of the CAPM based on empirical frameworks similar to ours tend to provide
some evidence that stock returns are positively related to $\beta$ but also throw up a
number of issues that cast doubt on the validity of the CAPM.
A classic result in this area that presents problems for the CAPM is that portfolios
of small stocks tend to lie way above the empirical SML. Banz (1981), for example,
finds a strong relationship between US stocks' market cap and their returns after
having accounted for $\beta$.
Cochrane (2001) presents evidence on the fit of the CAPM to US stock data. Using
10 market cap based stock portfolios plus two bond portfolios he verifies that the
mean returns on these 12 portfolios are positively related to $\beta$. However, the stock
portfolios tend to lie above the ex-post SML, and the smallest stock portfolio way
above the line.
Finally, evidence contained in Fama and French (1993) demonstrates problems with
the CAPM based on size and, additionally, book-to-market for a cross-section of US
stocks.
First of all they perform a bivariate sort of their stock universe using quintiles of
the distributions of size and book-to-market, thus generating 25 portfolios. These 25
portfolios then become the basic assets under analysis.
Again using a time-series and then cross-section approach, Fama and French find
that the standard CAPM explains virtually none of the cross-sectional variation in
portfolio returns.
We will return to this evidence in the next chapter but, for the time being, it is
sufficient to say that the results of Fama and French (1993) point in the direction
of models for the cross-section of stock returns that support multiple factors, rather
than the single factor CAPM approach.
7.4 Conclusions
In this chapter we have developed the theory behind the ever-popular Capital Asset
Pricing Model. We began with an exposition of mean-variance analysis before
imposing market equilibrium and arriving at the familiar linear representation for excess
stock returns.
We then moved on to describing simple empirical tests of this model and reviewing
empirical evidence. Unfortunately, our review of applied work in this area has led us
to the conclusion that the CAPM does not tell the entire story in terms of describing
the cross-section of stock returns.
$\beta$s appear to have positive explanatory power for portfolios of stocks in some
applications but other studies point towards more complex expected return generating
mechanisms where factors other than market $\beta$s matter. It is to such multi-factor
models that we turn in the next chapter.
CHAPTER 8
Multifactor models and the APT
We now develop an alternative approach to determining expected returns on risky
assets - the Arbitrage Pricing Theory.
This model is built on the notion of absence of arbitrage, which can be derived
from relatively weak restrictions on preferences.
However, we will need fairly strong assumptions regarding the data generating
process for stock returns - we assume that stock returns behave according to a
factor model.
In what follows we will first elaborate on the notion of absence of arbitrage and the
preference restrictions required to ensure it holds. Then we will introduce our factor
model for return generation. Finally we will put these two ingredients together to
derive the APT.
8.1 Absence of arbitrage: review
Consider a situation where one is presented with an investment choice between two
portfolios, A and B.
The portfolios deliver identical future returns in every possible state of nature.
For some reason, A costs less than B.
What would your investment decision be?
Naive answer: portfolio A is more attractive than B as it delivers identical future
prospects but at lower cost. Thus buy A.
But we can do better than this .....
8.1.1 Building the arbitrage
Let's do the following instead;
One simultaneously buys A and (short) sells B.
Then one makes some money today, as buying A costs less than the cash inflow
that shorting B generates.
In the future, as the portfolios have identical payoffs, the net exposure from
buying A and shorting B is zero in every state of nature.
Result: we have created a composite portfolio that creates a cash inflow today and
is associated with no future risks. Clearly, every smart investor will want to invest in
this portfolio on a massive scale, generating large immediate payoffs and no future
exposures.
We have constructed an arbitrage portfolio.
8.1.2 Absence of arbitrage
In a well functioning financial market such arbitrage opportunities should not exist.
Greedy investors should run these arbitrage strategies on a massive scale and the act
of running the strategy should eliminate the arbitrage opportunity.
To see why, consider the effect of running the arbitrage strategy on the prices of A
and B.
The arbitrage creates buying pressure for A, which will increase A's price.
The arbitrage creates selling pressure for B, which will reduce B's price.
Thus, with the price of A rising and B falling the arbitrage opportunity shrinks and
eventually will disappear.
8.1.3 Preference restrictions
What we have described above is the basic reasoning behind the notion of absence
of arbitrage in financial markets.
Note that we have said nothing regarding investors other than the fact that they
are greedy. The only preference restriction required in order to generate absence of
arbitrage is that investors have utility functions that slope upwards.
We need make no assumption regarding investors' attitudes to risk.
8.1.4 Formal definition of arbitrage portfolios
We can describe two distinct types of arbitrage;
Portfolios with negative investment costs (i.e. where proceeds of sales exceed
costs of purchases) and zero future payoffs in all states of nature.
Portfolios with zero investment costs and positive future payoffs in some states
of nature.
Pricing using absence of arbitrage is widely used in both academic and practical
finance, especially in the pricing of derivative assets.
Assume that we are trying to price a new asset, Z. Our first step is to create a
portfolio of existing assets that has identical payoffs to Z in every state of nature:
the replicating portfolio. Then, to ensure no arbitrage opportunity exists, the price
of Z must be exactly the same as the cost of the replicating portfolio.
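To make the replication argument concrete, here is a small sketch in a two-state world. The two existing assets, their prices and the payoffs of Z are hypothetical numbers invented for the example. We solve for the holdings that reproduce Z's payoff in each state and then cost that portfolio.

```python
import numpy as np

# Hypothetical two-state economy (all numbers are illustrative assumptions).
# Rows are states of nature, columns are existing assets (bond, stock).
payoffs = np.array([[1.0, 2.0],    # state "up"
                    [1.0, 0.5]])   # state "down"
prices = np.array([0.95, 1.00])    # today's prices of the bond and the stock

# New asset Z pays 1 in the up state and nothing in the down state
z_payoff = np.array([1.0, 0.0])

# Holdings of the existing assets that replicate Z state by state
holdings = np.linalg.solve(payoffs, z_payoff)

# Absence of arbitrage: Z must cost the same as its replicating portfolio
z_price = prices @ holdings

print("holdings (bond, stock):", holdings)
print("no-arbitrage price of Z:", round(z_price, 4))
```

If Z traded at any other price, buying the cheaper of Z and the replicating portfolio while shorting the dearer one would give an immediate cash inflow with zero future exposure.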
8.2 Factor models for returns
In the absence of any strong restrictions on preferences, in order to progress towards
our goal of deriving a new asset pricing model, we need some further structure. That
structure will be placed upon the data generating mechanism for the returns on risky
assets.
Our model is going to require that the return on any risky asset is generated by a
factor model as shown below;
\[ r_i = a_i + \sum_{j=1}^{K} b_{i,j} I_j + e_i, \quad E(e_i) = 0, \quad \mathrm{Var}(e_i) = \sigma_i^2, \quad \mathrm{Cov}(I_j, I_k) = 0 \;\; \forall\, j \neq k \tag{8.1} \]
8.2.1 Restrictions on the factor model
The factor model obeys the following conditions;
The factor model says that stock returns are a linear combination of K common
influences, the $I_j$ variables. These factors affect all stocks but different stocks
may have different sensitivities via the $b_{i,j}$ coefficients.
The factors are assumed to be uncorrelated with each other.
Each stock has its own constant term in the return equation, $a_i$.
Each stock is influenced by its own mean-zero random return term, $e_i$, and
these are uncorrelated across stocks.
The random terms are all uncorrelated with each of the common factors.
\[ E(e_i e_k) = 0 \;\; \forall\, i \neq k, \qquad E\left[ e_i \left( I_j - \bar{I}_j \right) \right] = 0 \;\; \forall\, i, j \]
8.2.2 Interpretation
Key feature: all covariation in stock returns is driven by the common factors (as
the noise terms are uncorrelated across stocks). Hence we view the model as one in
which there are K common sources of risk. Additionally, each stock has a source of
idiosyncratic risk (the e
i
terms).
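To demonstrate, let's explicitly calculate the return covariance for stocks 1 and 2,
assuming that there are 3 common factors;
\[ \begin{aligned} \mathrm{Cov}(r_1, r_2) &= \mathrm{Cov}\Big( a_1 + \sum_{j=1}^{3} b_{1,j} I_j + e_1,\; a_2 + \sum_{j=1}^{3} b_{2,j} I_j + e_2 \Big) \\ &= \mathrm{Cov}\Big( \sum_{j=1}^{3} b_{1,j} I_j,\; \sum_{j=1}^{3} b_{2,j} I_j \Big) \end{aligned} \]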
where the a terms drop out because they are constants and the e terms disappear because
they are uncorrelated with everything in the model. Finally, as the common factors
are all uncorrelated with one another we get;
\[ \mathrm{Cov}(r_1, r_2) = \sum_{j=1}^{3} b_{1,j} b_{2,j} \mathrm{Var}(I_j) \]
Thus, it is only the common factors that contribute to stock return covariances.
Each of the K common sources of risk affects the covariance between returns, but
the stock-specific risk terms (the $e_i$ terms) do not.
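The covariance result can be checked numerically. In the sketch below, the loadings, factor variances and residual variances are hypothetical values chosen for illustration, and `factor_model_cov` is a name introduced here. With uncorrelated factors the implied covariance matrix is the loadings times the diagonal factor variance matrix times the loadings transposed, plus a diagonal matrix of residual variances, so residual risk only shows up on the diagonal.

```python
import numpy as np

def factor_model_cov(loadings, factor_vars, resid_vars):
    """Covariance matrix implied by a factor model with uncorrelated factors."""
    B = np.asarray(loadings, dtype=float)          # shape (N stocks, K factors)
    factor_cov = np.diag(np.asarray(factor_vars, dtype=float))
    resid_cov = np.diag(np.asarray(resid_vars, dtype=float))
    return B @ factor_cov @ B.T + resid_cov

# Hypothetical loadings of stocks 1 and 2 on three factors
loadings = [[1.0, 0.5, -0.2],
            [0.8, 0.0,  0.4]]
factor_vars = [0.04, 0.01, 0.02]   # Var(I_1), Var(I_2), Var(I_3)
resid_vars = [0.03, 0.05]          # Var(e_1), Var(e_2)

cov = factor_model_cov(loadings, factor_vars, resid_vars)

# Direct application of the summation formula for Cov(r_1, r_2)
cov_12 = sum(b1 * b2 * v for b1, b2, v in zip(loadings[0], loadings[1], factor_vars))

print("Cov(r1, r2) from matrix form:", cov[0, 1])
print("Cov(r1, r2) from summation:  ", cov_12)
```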
8.3 The APT: a simple derivation
Start with a two factor model for a set of N diversified portfolios as below;
\[ r_{i,t} = a_i + b_{i,1} I_{1,t} + b_{i,2} I_{2,t} \tag{8.2} \]
Note: the assumption that asset i is a diversified portfolio means that there is no
specific risk for this asset.
8.3.1 Factor models and portfolio characteristics
Consider forming an arbitrary portfolio, P, of our securities where the portfolio
weights are denoted $X_i$. The factor loadings of this portfolio are;
\[ b_{P,1} = \sum_{i=1}^{N} X_i b_{i,1}, \qquad b_{P,2} = \sum_{i=1}^{N} X_i b_{i,2}, \qquad \sum_{i=1}^{N} X_i = 1 \]
8.3.2 A two-factor, three asset example
Consider the following three diversified portfolios which exist in a 2-factor world;
Asset   $a_i$   $b_{1,i}$   $b_{2,i}$
A       0.05     3           1
B       0.01     0           2
C       0.00    -1           1
The assumption that these portfolios are well-diversified allows us to assume that
their residual risks are zero at all times. Thus, for example, asset A has a
return-generating model that looks as follows;
\[ r_A = 0.05 + 3 I_1 + I_2 \]
8.3.3 Factor replicating portfolios
We begin by identifying a replicating portfolio for each factor and a replicating
portfolio for the risk-free asset.
The factor replicating portfolio for factor 1 is a portfolio with a loading of 1 on factor
1 and 0 on factor 2. Similarly, the factor replicating portfolio for factor 2 has loading
zero on factor 1 and loading 1 on factor 2. Thus these portfolios have the properties;
Portfolio   $b_{i,1}$   $b_{i,2}$   Mean return
FRP 1       1           0           $\bar{r}_1$
FRP 2       0           1           $\bar{r}_2$
Risk-free   0           0           $r_f$
8.3.4 Example: deriving replicating portfolio weights
Deriving these weights is straightforward: it just involves solving sets of simultaneous
equations. For example, to derive the weights for the risk-free portfolio we must solve
the following set of simultaneous equations;
\[ \begin{aligned} 3X_1 + 0X_2 - 1X_3 &= 0 \\ 1X_1 + 2X_2 + 1X_3 &= 0 \\ X_1 + X_2 + X_3 &= 1 \end{aligned} \]
where the final equation is just an adding up condition for the portfolio weights.
Solving these equations gives $X_1 = 0.5$, $X_2 = -1$ and $X_3 = 1.5$.
Doing the same to derive factor-replicating portfolios for factors 1 and 2 we get;
Portfolio   $X_1$   $X_2$   $X_3$   $\bar{r}$
Risk-free   0.5     -1      1.5     0.015
Factor 1    0.75    -1      1.25    0.0275
Factor 2    0.25     0      0.75    0.0125
The final column of the preceding table gives the expected returns on the three
portfolios formed as the appropriate weighted average of the $a_i$s. Thus, the risk-free
rate in our model is 1.50%. The excess expected returns on the two pure factor
portfolios are 1.25% (i.e. 2.75% less 1.50%) and -0.25% respectively.
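The simultaneous equations for all three replicating portfolios can be solved in one pass. The sketch below uses the asset data from the example in Section 8.3.2; the variable names and the use of NumPy's linear solver are choices made here for illustration. Each target is a pair of desired factor loadings plus the adding-up condition.

```python
import numpy as np

# Assets A, B, C from the example: constants and factor loadings
a = np.array([0.05, 0.01, 0.00])
b1 = np.array([3.0, 0.0, -1.0])
b2 = np.array([1.0, 2.0, 1.0])

# Rows: loading on factor 1, loading on factor 2, sum of weights
system = np.vstack([b1, b2, np.ones(3)])

# Desired (factor 1 loading, factor 2 loading, sum of weights) for each portfolio
targets = {
    "Risk-free": [0.0, 0.0, 1.0],
    "Factor 1":  [1.0, 0.0, 1.0],
    "Factor 2":  [0.0, 1.0, 1.0],
}

weights = {name: np.linalg.solve(system, np.array(t)) for name, t in targets.items()}
expected = {name: float(w @ a) for name, w in weights.items()}

for name in targets:
    print(name, np.round(weights[name], 4), round(expected[name], 4))
```

The expected returns are the weighted averages of the asset constants, which matches the final column of the table above.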
8.3.5 Replication
Consider: an asset A with factor loadings $b_{A,1}$ and $b_{A,2}$.
Technique: we're going to form a portfolio of the factor-replicating portfolios and
the risk-free asset which has identical factor loadings to A.
Absence of arbitrage: with identical loadings and no specific risk, the expected
return on A must be identical to the expected return on the portfolio we've
constructed.
8.3.6 Calculating the replicating weights
If $X_1$ is the weight on FRP 1 and $X_2$ is the weight on FRP 2 and $X_3$ is the weight
on the risk free asset then we must have;
\[ \begin{aligned} b_{A,1} &= X_1 \cdot 1 + X_2 \cdot 0 + X_3 \cdot 0 = X_1 \\ b_{A,2} &= X_1 \cdot 0 + X_2 \cdot 1 + X_3 \cdot 0 = X_2 \\ 1 &= X_1 + X_2 + X_3 \end{aligned} \]
Hence, given the values for $X_1$ and $X_2$ we have;
\[ X_3 = 1 - b_{A,1} - b_{A,2} \]
8.3.7 Finally ......
The expected return on the replicating portfolio is equal to the sum of the weights
times the expected returns on the components i.e.
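\[ X_1 \bar{r}_1 + X_2 \bar{r}_2 + X_3 r_f = b_{A,1} \bar{r}_1 + b_{A,2} \bar{r}_2 + (1 - b_{A,1} - b_{A,2}) r_f \]
By absence of arbitrage this must be equal to the expected return on A. Thus;
\[ \begin{aligned} \bar{r}_A &= b_{A,1} \bar{r}_1 + b_{A,2} \bar{r}_2 + (1 - b_{A,1} - b_{A,2}) r_f \\ &= r_f + b_{A,1} (\bar{r}_1 - r_f) + b_{A,2} (\bar{r}_2 - r_f) \end{aligned} \]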
Expected returns on any asset can be calculated from expected returns on the factor
replicating portfolios and the risk-free rate.
8.3.8 The APT equation
The preceding equation is the key APT pricing equation and thus I'll repeat it;
\[ \bar{r}_A = r_f + b_{A,1} (\bar{r}_1 - r_f) + b_{A,2} (\bar{r}_2 - r_f) \tag{8.3} \]
To calculate the expected return on any asset we need to know;
Its exposures to the factors
The risk-free rate
The expected returns on the factor replicating portfolios
We use these expected returns in the same way one would use expected returns from
the CAPM (i.e. for valuation, risk-adjusted discounting or portfolio selection).
8.3.9 Example: the APT equation
In our example, the APT equation for a single portfolio can be written as follows;
\[ \bar{r}_i = r_f + (\bar{r}_1 - r_f)\, b_{1,i} + (\bar{r}_2 - r_f)\, b_{2,i} = 0.015 + 0.0125\, b_{1,i} - 0.0025\, b_{2,i} \]
All assets in this world must have expected returns and factor exposures that satisfy
this APT equation. Thus, if we were to introduce a new asset, called D, with
exposures of 2 and 4 to the first and second factor respectively, then its expected return
would have to be;
\[ \bar{r}_D = 0.015 + 0.0125 \times 2 - 0.0025 \times 4 = 0.03 = 3\% \]
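As a quick check on the arithmetic, the pricing equation from the example can be wrapped in a small function. The function name `apt_expected_return` and its default arguments are introduced here for illustration; the defaults are the risk-free rate and factor premia derived in the example. Applying it to assets A, B and C should reproduce their constants, and applying it to D gives its no-arbitrage expected return.

```python
def apt_expected_return(b1, b2, r_f=0.015, premium_1=0.0125, premium_2=-0.0025):
    """Expected return implied by the two-factor APT equation of the example."""
    return r_f + premium_1 * b1 + premium_2 * b2

# (loading on factor 1, loading on factor 2) for each asset
exposures = {"A": (3, 1), "B": (0, 2), "C": (-1, 1), "D": (2, 4)}

apt_returns = {name: apt_expected_return(b1, b2) for name, (b1, b2) in exposures.items()}

for name, value in apt_returns.items():
    print(f"Asset {name}: expected return = {value:.4f}")
```

The fact that A, B and C price exactly is no surprise: the risk-free rate and the factor premia were backed out of those three assets in the first place.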
8.4 The Arbitrage Pricing Theory: formal derivation
We now present a more mathematical derivation of the APT equation, based on our
assumptions of greedy investors and stock returns following a factor model.
We again assume a set of N assets exist (all of which obey our factor structure for
returns) and form a zero-investment portfolio from these assets. If we denote the
vector of investment in the N assets by X, with individual elements $X_i$, then the
zero investment condition is;
\[ X'\mathbf{1} = \sum_{i=1}^{N} X_i = 0 \tag{8.4} \]
We then require our portfolio to have zero exposure to all of the common return
factors i.e. we want a portfolio with no systematic risk. This can be written as;
\[ X'\mathbf{b}_j = \sum_{i=1}^{N} X_i b_{i,j} = 0, \quad \forall\, j \tag{8.5} \]
where $\mathbf{b}_j$ is the vector formed from the individual assets' exposures to the jth common
factor. Note that to do this we need N to be big relative to K. With N less than K
one would not be able to force all factor exposures to zero simultaneously.
Finally, the portfolio must have residual risk of (approximately) zero. If N is relatively
large (e.g. greater than 50), then any balanced portfolio of stocks will automatically
have very small residual risk. This condition can be written as;
\[ X'\mathbf{e} = \sum_{i=1}^{N} X_i e_i \approx 0 \tag{8.6} \]
where $\mathbf{e}$ is the vector formed from the stock-level error terms.
Absence of arbitrage: equations (8.4) to (8.6) imply that we have built a portfolio
that costs nothing and which has no exposure to any risk, either factor risk or residual
risk. Absence of arbitrage thus requires that the portfolio earns a zero expected
return. Thus, from equations (8.4) to (8.6) and absence of arbitrage we can conclude
that;
\[ X'\bar{\mathbf{r}} = \sum_{i=1}^{N} X_i \bar{r}_i = 0 \tag{8.7} \]
where $\bar{r}_i$ is the expected return for asset i and $\bar{\mathbf{r}}$ is the vector of expected returns.
Finally, we can use a bit of linear algebra to arrive at the APT equation. A
mathematician would describe equations (8.4) and (8.5) as indicating that the vector of
portfolio weights is orthogonal to a vector of ones and also orthogonal to all K of the
vectors of factor exposures.
A standard theorem from linear algebra states that if a set of orthogonality conditions
implies another orthogonality condition, then the vector that is the subject of the
implied orthogonality condition can be formed as an exact linear combination of the
vectors involved in the other conditions. In our setting this means that expected
returns can be written as an exact linear combination of a vector of ones and the
vectors of factor exposures i.e.
\[ \bar{\mathbf{r}} = \lambda_0 \mathbf{1} + \lambda_1 \mathbf{b}_1 + \lambda_2 \mathbf{b}_2 + \dots + \lambda_K \mathbf{b}_K \tag{8.8} \]
where $\lambda_j$ is a constant scalar for all j.
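The linear algebra can be illustrated numerically. In this sketch, the number of assets, the loadings and the coefficients in `lam` are all hypothetical values chosen for the example. Expected returns are built as a linear combination of a vector of ones and the loading vectors; the code then confirms that any zero-investment, zero-exposure weight vector earns zero expected return, and that a least squares fit recovers the coefficients with no residual.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 6, 2                          # hypothetical: 6 assets, 2 factors

B = rng.normal(size=(N, K))          # factor exposures of each asset
Z = np.column_stack([np.ones(N), B]) # columns: vector of ones, then exposures
lam = np.array([0.02, 0.01, -0.005]) # hypothetical constant and factor coefficients
r_bar = Z @ lam                      # expected returns as an exact linear combination

# Weight vectors orthogonal to the ones vector and to every exposure vector:
# the trailing rows of vt from the SVD of Z.T span its null space.
_, _, vt = np.linalg.svd(Z.T)
null_basis = vt[K + 1:].T                      # shape (N, N - K - 1)
X = null_basis @ rng.normal(size=N - K - 1)    # an arbitrary such portfolio

zero_cost_and_exposure = Z.T @ X     # should be (numerically) a vector of zeros
expected_return = X @ r_bar          # should also be (numerically) zero

# Conversely, regressing expected returns on Z recovers lam exactly
lam_hat, *_ = np.linalg.lstsq(Z, r_bar, rcond=None)

print("X'1 and X'b_j:", np.round(zero_cost_and_exposure, 12))
print("X'r_bar:", round(float(expected_return), 12))
print("recovered coefficients:", np.round(lam_hat, 6))
```

With N equal to K + 1 the null space would be empty, which is one way to see why the argument needs many more assets than factors.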
8.4.1 Interpretation
How should we interpret these $\lambda$s? Well first, let's assume that a portfolio/asset exists
with zero exposures to all the common factors. This portfolio is clearly risk-free and
so;
\[ \bar{r}_f = r_f = \lambda_0 \]
Next, consider a portfolio with zero exposure to all common factors other than the
jth and unit exposure to factor j. This is the factor replicating portfolio for factor j.
Denoting the expected return on the factor replicating portfolio by $\bar{r}_j$ we have;
\[ \bar{r}_j = \lambda_0 + \lambda_j = r_f + \lambda_j \quad \Longrightarrow \quad \lambda_j = \bar{r}_j - r_f \]
The preceding argument holds for all j so we can rewrite equation (8.8) as follows;
\[ \bar{\mathbf{r}} = r_f \mathbf{1} + (\bar{r}_1 - r_f)\, \mathbf{b}_1 + (\bar{r}_2 - r_f)\, \mathbf{b}_2 + \dots + (\bar{r}_K - r_f)\, \mathbf{b}_K \tag{8.9} \]
The APT: Equation (8.9) is the key APT equation. It tells us that the
expected return on any asset or portfolio can be described as the sum of
the risk free rate and a linear combination of the factor exposures of the
asset in question. The weights in that linear combination are given by the excess
expected returns of a set of pure factor replicating portfolios.
Note: the logic behind the derivation of equation (8.9) allows us to derive the APT
equation in a practical setting. We find the factor-replicating portfolios and their
expected returns. Then we just plug the numbers into equation (8.9).
8.4.2 The CAPM and the APT
Note the similarities between the form of the APT in (8.9) and the CAPM equation.
If we assume only 1 common factor in the APT then the CAPM and APT equations
look very similar: just substitute $\bar{r}_1$ with $\bar{r}_M$.
However, note the differences in the derivation of the CAPM and APT.
The former is an equilibrium model built on the foundations of mean-variance
analysis.
The latter is based on an assumed factor structure for returns along with an
application of absence of arbitrage.
One advantage of the APT over the CAPM is that, via absence of arbitrage, it should
be valid for any (sufficiently) large subset of risky assets. As we saw in the previous
chapter, testing the CAPM requires one to know the composition of the market
portfolio of assets.
8.5 Empirical tests of the APT
The key equations for the APT are (8.1) and (8.9). Empirical tests of the APT thus
focus on estimating these equations. We have repeated them below for clarity;
\[ r_i = a_i + \sum_{j=1}^{K} b_{i,j} I_j + e_i \]
\[ \bar{r}_i = r_f + (\bar{r}_1 - r_f)\, b_{i,1} + (\bar{r}_2 - r_f)\, b_{i,2} + \dots + (\bar{r}_K - r_f)\, b_{i,K} \]
A variety of methods for estimating and testing the APT exist. Below we give a brief
overview of two techniques.
8.5.1 Economic or Characteristic-based factor models
We can use a two-stage approach similar to that used in testing the CAPM;
1. Use insights from economic theory to determine a set of macroeconomic or
financial market related variables that will serve as factors (the $I_j$s) in our
analysis, e.g. unexpected changes in inflation or interest rates or the return on
a stock market index. This is clearly an important step: the factors are the
foundations of the model and a bad choice of their identities here will lead to a
bad model.
2. Estimate equation (8.1) to yield the values of $b_{i,j}$ for all stocks and factors. This
completes the first stage of the estimation.
Then we use the estimated factor sensitivities (the $b_{i,j}$) to estimate equation (8.9).
Note that we have the same problem with this second stage regression as previously
with the CAPM: the $b_{i,j}$ are measured with error.
Example: Chen, Roll, and Ross (1986): these authors select as factors variables
derived from inflation, industrial production, the difference between yields on high
quality and low quality bonds (credit spreads) and the term structure of interest
rates. They find that these variables are significant in explaining the cross-section of
expected US stock returns.
Alternative: others use returns on certain portfolios of stocks to represent the
factors. Key references here are Fama and French (1992) and Fama and French (1993).
These authors use returns on the market, returns to a size portfolio (small minus
large firm returns) and returns to a portfolio constructed on the basis of stocks'
book-to-market values (high BTM returns minus low BTM returns). They also
include as factors a term structure variable (yields on long-term government bonds less
short-term interest rates) and a credit spread variable (yields on long-term corporate
bonds less returns on long term government bonds). Again, these factors are shown
to be significant in the first-stage regression and sensitivities to them have good
explanatory power for the cross-section of returns in the second stage regression.
8.5.2 Statistical Factor models
Another common way of specifying and testing the factor model that underlies the
APT is to use statistical factor analysis. This is a technique that, for a given value
of K, identifies the K linear combinations of the individual stock return data that
explain as much of the variation in the returns as possible while minimizing the
covariation in the errors produced.
Thus, in this setting, the researcher does not rely on any economic intuition to
determine the identity of the factors: they are automatically delivered by the statistical
analysis. The approach simultaneously delivers the factors and the factor
sensitivities.
Techniques exist that allow one to test whether the number of factors in one's
statistical model is appropriate. If, for example, one has 5 factors in one's statistical
model then we can test whether the addition of a sixth factor explains a significant
portion of the variation in residual returns.
Once we have settled on the number of factors in the model then we can use the
estimated factor sensitivities as before in a second stage regression to estimate the
APT equation (8.9).
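To give a flavour of how statistical factors can be extracted, the sketch below uses principal components, a close relative of factor analysis rather than the maximum likelihood factor analysis used in the classic studies. The simulated one-factor return panel, the function name `principal_component_factors` and all parameter values are assumptions made for illustration.

```python
import numpy as np

def principal_component_factors(returns, k):
    """Extract k statistical factors from a (T x N) return panel via PCA."""
    R = np.asarray(returns, dtype=float)
    demeaned = R - R.mean(axis=0)
    cov = np.cov(demeaned, rowvar=False)        # N x N sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort to descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :k]                   # N x k portfolio weights
    factors = demeaned @ loadings               # T x k factor realisations
    explained = eigvals[:k].sum() / eigvals.sum()
    return factors, loadings, explained

# Simulated panel: 500 periods, 10 stocks, one common factor (hypothetical values)
rng = np.random.default_rng(7)
common = rng.normal(0.0, 0.04, size=500)
noise = rng.normal(0.0, 0.012, size=(500, 10))
panel = np.outer(common, np.ones(10)) + noise

factors, loadings, explained = principal_component_factors(panel, k=1)
print(f"share of return variance explained by 1 factor: {explained:.1%}")
```

Comparing the explained share as k increases is an informal counterpart to the formal tests for the number of factors mentioned above.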
The classic example of this methodology is Roll and Ross (1980) who look at daily
US data for the 1960s and early 1970s. They find that a 3 factor model does a good
job of explaining the cross-section of returns.
These results and those from the economic factor models tend to indicate that stock
return data is best supported by a model more complex (in terms of the number of
factors) than the CAPM.
8.6 Conclusion
The APT provides an alternative approach for describing the cross-section of expected
security returns.
It is based on assuming a factor model as one's return generating process and then
applying absence of arbitrage arguments. These ingredients yield a model for
expected returns that is a sum of the risk free rate and a linear combination of the
asset-specific factor exposures.
Empirical results from application of the APT tend to indicate that a relatively small
number (5 or fewer) of economic or statistical factors do a good job of explaining
covariation in stock returns and explaining the cross-section of expected returns.
Empirical evidence for multi-factor models tends to erode researchers' confidence in
the CAPM as a good model for explaining the cross-section of expected stock
returns.
CHAPTER 9
Market efficiency
The CAPM and APT deliver formulae for the returns that an asset should yield due
to its risk characteristics. These returns are thus not free money but compensate the
investor holding the asset for bearing risk.
Once we understand the returns that securities are supposed to deliver, we can begin
to ask whether some assets appear to deliver greater returns than justified by their
risk. If such assets can be identified then a smart investor can form a portfolio that
generates excess returns i.e. the investor is being rewarded with returns above the
level justified by his portfolio's risk.
Evaluating whether an investor can identify stocks that deliver returns over and
above those justified by risk is essentially examining whether security markets are
informationally efficient. A market is said to be informationally efficient if
investors cannot earn the type of free money described above.
9.1 Informational efficiency: definition
The classic definition of efficiency is that given by Jensen (1978). It is as follows;
Definition 1: a market is said to be informationally efficient with respect
to an information set $\Omega$, if an investor cannot make economic profits by
trading on the information contained in $\Omega$. By economic profits, we mean
risk-adjusted returns net of all costs.
Malkiel (1992) gives an alternative definition;
Definition 2: a capital market is said to be efficient if it fully and correctly
reflects all relevant information in determining security prices. Formally,
the market is said to be efficient with respect to some information set, $\Omega$,
if security prices would be unaffected by revealing that information to all
participants. Moreover, efficiency with respect to an information set, $\Omega$,
implies that it is impossible to make economic profits by trading on the
basis of $\Omega$.
Note the common features in these definitions. Both argue that in an efficient market,
a specified set or piece of information cannot be used to make a portfolio allocation
decision that allows one to make positive trading profits.
Another way to view this from a more statistical perspective is that the information
in question cannot be used to forecast the economic profits available from financial
securities.
A final way to say this is that, in an efficient market, there should be no way to tell in
advance which securities will deliver returns above those justified by their risk and
which will deliver returns below the level justified by their risk.
9.1.1 Fleshing out the definitions
However, both definitions leave certain things unspecified. These are listed below;
1. The information set ($\Omega$): details which pieces of information we are using
to try to forecast returns on securities. Obviously there are many possible
definitions of this set.
2. Risk-adjustment: as we indicated in the introduction to this chapter, only
if we know the returns that a portfolio should deliver due to its risk can we
discover if that portfolio generates free money of the sort discussed previously.
3. Trading costs: finally, to be able to say that a portfolio generates free money
we must account for the costs of forming and rebalancing that portfolio in terms
of trading costs (brokerage fees and commissions).
9.1.2 Informational versus operational efficiency
Finally, we should attempt to distinguish the informational efficiency that we are
discussing from the alternative notion of operational efficiency.
A market is operationally efficient if it is cheap, easy and quick to trade whichever
securities one wishes to, in whichever quantities one wishes to. Thus operational
efficiency is really a question of how smoothly a market runs.
On the other hand, informational efficiency is concerned with securities being correctly
priced given all available information.
Of course, there are likely to be linkages between the two forms of efficiency. A
market which is more efficient from an operational point of view, for example, will
probably also be more efficient in an informational sense. The ease of trading that
operational efficiency brings means that prices will respond more quickly and accurately
to new information.
9.2 Information sets and varieties of efficiency
Part of the definition of an efficient market was the set of information that we are
defining efficiency relative to. Clearly different choices of information set will result in
one testing different flavours of efficiency. Roberts (1967) came up with the following
efficiency classification:
1. Weak-form efficiency: here the information set is defined to be the past
history of prices (and sometimes volumes) for a stock.
2. Semi-strong form efficiency: the market for an asset is said to be semi-
strong form efficient (SSFE) if all publicly available information is incorporated
in the asset's price.
3. Strong form efficiency: this is the most stringent definition. In a strong-
form efficient (SFE) market all information, either private or public, is fully
reflected in stock prices.
Note that as we go down the preceding list, the information set in question is getting
wider and wider. The past history of prices and volumes for a stock is a subset
of all publicly available information and, obviously, the set of all publicly available
information is a subset of the set of all private and public information.
Thus, as the information sets widen, the efficiency concept being tested is getting
stronger. It is clearly more demanding to assert that a certain stock's price fully
reflects all private and public information than to suggest that the price reflects only
that information contained in the historical price record.
In what follows we will review some of the testing procedures and results relevant to
all three of these efficiency concepts.
9.3 Risk-adjustment and testing efficiency
As indicated by the definition of Jensen (1978), for example, to assess whether a
stock market is efficient we need to understand whether genuine economic trading
profits can be made from trading on information.
To do this, we need to separate the returns from holding the stock that are due to
bearing the risk the stock carries from abnormal or excess returns. For this we need
an asset pricing model.
Problem: proper accounting for expected returns requires us to use the
correct asset pricing model that generated expected returns in the data
we are analyzing. But we do not know the true asset pricing model that
generated our data. We can never be sure that we have settled on the
correct model.
This problem implies that all tests of market efficiency are actually tests of a joint
hypothesis. In statistical language, when we conduct an efficiency test, our null
hypothesis comprises two parts:
• The market under analysis is informationally efficient.
• We know the correct asset pricing model that generates expected returns in
this market.
The problem here is the uncertainty created when interpreting any test results:
• Suppose a test indicates strong positive profits to trading on certain information.
One way to interpret this is that it leads us to reject the notion of efficiency in this
market. A second interpretation is that the asset pricing model that we have
used in computing economic profits is incorrect.
Thus, the joint-hypothesis problem clouds our ability to interpret results from
efficiency tests.
9.4 Efficiency and random returns
Regardless of which variety of efficiency we are testing, we can think of a market that
is efficient with respect to \(\Omega\) as one in which future (risk-adjusted) returns appear
to be entirely random from the perspective of an individual who has the information
set given by \(\Omega\).
This interpretation often confuses people: if markets correctly and quickly absorb
all relevant information then it seems odd to assert that returns will be random.
However, a bit more thought shows the interpretation to be valid.
In an efficient market, as all currently known information has been absorbed into
prices, all that can move prices is the arrival of new information. This new
information is by definition unpredictable (otherwise it would not be information at
all), so any price movement associated with it is also unpredictable, i.e. future
returns are unpredictable or random.
Note also that efficiency or randomness of future (risk-adjusted) returns does not
imply that there is no relationship between stock prices and fundamental variables
like dividends or corporate earnings. Perhaps increased dividend payments lead to
increased stock prices, but in an efficient market stock prices adjust instantly to news
of increased dividend payments such that traders cannot profit from this news and
future returns are still impossible to forecast.
9.5 Efficiency and statistics
Consider a random variable, \(X_t\), and a time-\(t\) information set \(\Omega_t\). \(X_t\) is said to be a
martingale if the following condition holds:
\[ E(X_{t+1} \mid \Omega_t) = X_t \tag{9.1} \]
where we assume that \(\Omega_t\) contains \(X_s\) for \(s \le t\).
Interpretation: one's best forecast of the value of the variable at \(t+1\), given current
information, is just the current value. If we think of \(X_{t+1}\) as representing the price of
a stock tomorrow and \(\Omega_t\) as the information available to an investor as of today, then this
equation is saying that our best guess of tomorrow's stock price is simply today's
closing stock price.
Note the linkage between this interpretation and the discussion in the previous
section. There we argued that in an efficient market future risk-adjusted returns (price
changes) should appear random to an investor. Here we are saying that if prices are
a martingale then one's best forecast of tomorrow's price is just today's price, i.e. a
return of zero.
Consider now a random variable \(y_t\). The variable \(y_t\) is a fair game (also known as a
martingale difference or MD) if:
\[ E(y_{t+1} \mid \Omega_t) = 0 \tag{9.2} \]
Given time-\(t\) information, one's best guess of the \(t+1\) value of an MD is zero. Obviously,
if \(X_t\) is a martingale then \(y_t = X_t - X_{t-1}\) is a martingale difference.
Continuing our previous analogy, if \(X_{t+1}\) is tomorrow's closing stock price then \(y_{t+1}\)
is the change in price between today's and tomorrow's close. Equation (9.2) says that
one's best guess of the change in price (or return) between one day and the next is
zero.
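The step from martingale to martingale difference can be written out explicitly. The short derivation below uses only equation (9.1) and the assumption, stated above, that the time-\(t\) information set (written \(\Omega_t\) here) contains \(X_t\):

```latex
\begin{align*}
E(y_{t+1} \mid \Omega_t)
  &= E(X_{t+1} - X_t \mid \Omega_t) \\
  &= E(X_{t+1} \mid \Omega_t) - X_t
     && \text{since } X_t \text{ is known given } \Omega_t \\
  &= X_t - X_t = 0
     && \text{by (9.1)}
\end{align*}
```

This is exactly condition (9.2).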
9.5.1 Abnormal returns and fair games
Denote the time-\(t\) return on a given stock as \(r_t\). We can view this return as comprising
two elements:
\[ r_t = E_{t-1}(r_t) + \tilde{r}_t \tag{9.3} \]
where \(E_{t-1}(r_t)\) is the equilibrium/expected return at \(t\), maybe generated by a CAPM
or APT model. Then \(\tilde{r}_t\) is the abnormal return. The equilibrium return on a stock
reflects the money it should deliver to cover its risk characteristics and hence (as we
said earlier) does not represent trading profit. True profit resides in the \(\tilde{r}_t\) component.
The EMH requires that abnormal returns are a fair game:
\[ E(\tilde{r}_{t+1} \mid \Omega_t) = 0 \tag{9.4} \]
Note the linkage between asserting that abnormal returns are a fair game and the
discussion of the previous section. There we argued that in an efficient market future
risk-adjusted returns (price changes) should appear random to an investor. Equation
(9.4) tells us that in an efficient market one's best guess of abnormal returns on the
basis of \(\Omega_t\) is exactly zero: one cannot forecast whether abnormal returns are likely
to be positive or negative. This implies that pure trading profits on the basis of \(\Omega_t\)
are on average zero.
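As a rough illustration of the decomposition in (9.3), the sketch below computes an abnormal return as the realised return minus a CAPM expected return. The function names, the choice of the CAPM as the pricing model and all of the numbers are illustrative assumptions, not part of the notes.

```python
def capm_expected_return(risk_free, beta, market_expected):
    """Equilibrium return under the CAPM: rf + beta * (E[rm] - rf)."""
    return risk_free + beta * (market_expected - risk_free)


def abnormal_return(realised, expected):
    """Abnormal return: the part of the realised return not explained by risk."""
    return realised - expected


# Hypothetical inputs, assumed purely for illustration.
expected = capm_expected_return(risk_free=0.02, beta=1.5, market_expected=0.08)
abnormal = abnormal_return(realised=0.15, expected=expected)
print(f"expected = {expected:.4f}, abnormal = {abnormal:.4f}")
```

If a different pricing model (e.g. an APT specification) is the correct one, the abnormal return computed this way is mismeasured, which is the joint-hypothesis problem in miniature.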
9.6 Testing WFE: return predictability
Here we review three approaches commonly used for evaluating whether securities
markets are weak-form efficient (WFE). They are:
• Autocorrelation analysis
• Calendar effects
• Technical trading rules
Each of these techniques attempts to use the patterns in past price/return data to
forecast the behaviour of future prices/returns.
9.6.1 Autocorrelation analysis
A tool commonly used by statisticians and econometricians to examine how a variable
is related to its own past values is the autocorrelation coefficient. Such a
coefficient tells us whether a systematic relationship exists between a variable and its
value \(k\) periods earlier. It is computed as follows:
\[ \rho_k = \frac{\mathrm{Cov}(r_t, r_{t-k})}{\mathrm{Var}(r_t)} \tag{9.5} \]
Thus, the autocorrelation is just a correlation coefficient between values of the same
variable measured \(k\) time intervals apart. Of course, such a coefficient can be
constructed for different \(k\). If the \(k\)th autocorrelation is positive (negative) we have a
situation where a positive return today tends to forecast positive (negative) returns
\(k\) periods ahead.
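A minimal sketch of how the sample counterpart of (9.5) might be computed is given below. The function name `autocorrelation` is a hypothetical helper, and the simulated i.i.d. returns are an assumption standing in for real return data; for such data the estimate should be close to zero.

```python
import numpy as np


def autocorrelation(returns, k):
    """Sample analogue of (9.5): Cov(r_t, r_{t-k}) / Var(r_t)."""
    r = np.asarray(returns, dtype=float)
    if not 1 <= k < len(r):
        raise ValueError("k must satisfy 1 <= k < len(returns)")
    dev = r - r.mean()
    # Sum of cross-products k periods apart, scaled by the total sum of squares.
    return float(np.sum(dev[k:] * dev[:-k]) / np.sum(dev ** 2))


# Simulated i.i.d. returns (an assumption, not real market data).
rng = np.random.default_rng(0)
simulated = rng.normal(loc=0.0, scale=0.01, size=5000)
rho_1 = autocorrelation(simulated, 1)
print(f"first-order autocorrelation: {rho_1:.4f}")
```

Applied to actual daily, weekly or long-horizon returns, the same function gives the kind of statistic reported in the studies discussed next.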
9.6.2 Autocorrelation evidence
Various authors have constructed such statistics for various securities from a number
of countries.
• Fama (1965) found little evidence of autocorrelations that were reliably different
from zero for daily data on US stock returns.
• Campbell, Lo, and Mackinlay (1996) provide recent evidence in this area using
US data from the 1960s through to the 1990s. They find positive first-order
autocorrelations for daily, weekly and monthly returns.
• deBondt and Thaler (1985) demonstrate that if one measures returns over very
long periods, e.g. 5 years, then there is clear evidence of strong negative
autocorrelation.
Problem: most of these studies fail to risk-adjust returns. Thus perhaps we are not
finding evidence of true inefficiency.
9.6.3 Calendar anomalies
One of the most heavily mined areas for return predictability studies is that which
seeks to suggest that returns are predictably higher or lower than usual at certain
times of the year, month or week (or even day).
The most famous of these effects is the January effect, which suggests that stock
returns tend to be significantly higher in January than in other months of the year.
See Fama (1991) for results and further references on the January effect in US, UK
and other stock return data.
Certain authors have also found evidence of a Monday effect in US return data:
returns on Monday are significantly lower than on other days of the week.
9.6.4 Explaining calendar anomalies
Calendar effects are puzzling, as explanations of these results based on inadequate
risk-adjustment require us to tell stories regarding calendar effects in the risk factors
that affect stock returns. This is not easy to do.
Other explanations focus on features of market institutions or trading arrangements
that cause certain parts of the week, month or year to differ from others (an example
of this is to appeal to the fact that the US tax year ends in December to explain
subsequent high returns in January).
Perhaps the most convincing explanation of the calendar anomalies found thus far
is that they arise by chance in finite samples of data and it is the intensive analysis
of these finite samples of stock return data that has uncovered them. Under this
interpretation, they do not really reflect inefficiency and will fail to generate positive
excess returns in future samples of data.
9.6.5 Technical trading rules
Technical and chart analysis are techniques widely used by FX and equities traders to
form trading rules. These techniques attempt to relate patterns observed in security
prices to subsequent returns. Some of the rules are based on statistical
transformations of past prices and others are based on graphical representations. Some
popular examples are described below:
1. Moving-average crossover rules: the trader computes two moving-averages
of past prices, one over a short horizon and one over a long horizon. A long
(short) signal is generated whenever the short moving-average cuts the long
moving-average from below (above).
2. Support/resistance rules: here the trader computes the maximum price
observed over recent times (the resistance level) and also the minimum price
observed over the same interval (the support level). These levels are thought
to define a range within which the security will trade in the absence of any new
information. If the current price breaks out above the resistance level then a long
signal is generated, and if the current price dives below the support level a short
signal is generated.
3. Head and shoulder patterns: this is a graphical technique in use by some FX
and equity traders. A head and shoulders top pattern is a pattern observable in
recent prices where a local maximum has been reached and then retreated from
(the left shoulder), followed by a higher maximum being attained and retreated
from (the head). Finally a third local maximum is achieved, at a lower level
than the head (the right shoulder). When prices then drop below this third
local maximum the pattern is complete and a sell signal is generated. A head
and shoulders bottom pattern is the opposite of that just described and creates
a buy signal. See Figure 9.1 for an example of a head and shoulders pattern.
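The moving-average crossover rule in item 1 can be sketched as follows. The helper names, the simple (equally weighted) moving average, the window lengths and the price path are all assumptions chosen for illustration; practitioners use many variants.

```python
import numpy as np


def moving_average(prices, window):
    """Trailing simple moving average; entries before a full window are NaN."""
    p = np.asarray(prices, dtype=float)
    out = np.full(len(p), np.nan)
    csum = np.cumsum(np.insert(p, 0, 0.0))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out


def crossover_signals(prices, short_window, long_window):
    """+1 where the short MA cuts the long MA from below, -1 from above, else 0."""
    diff = moving_average(prices, short_window) - moving_average(prices, long_window)
    signals = np.zeros(len(diff), dtype=int)
    for t in range(1, len(diff)):
        if np.isnan(diff[t - 1]) or np.isnan(diff[t]):
            continue  # not enough history for both averages yet
        if diff[t - 1] <= 0 < diff[t]:
            signals[t] = 1       # long signal
        elif diff[t - 1] >= 0 > diff[t]:
            signals[t] = -1      # short signal
    return signals


# Hypothetical price path: falls, then rises, then falls again.
prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 9, 8, 7, 6]
signals = crossover_signals(prices, short_window=2, long_window=4)
print(signals)
```

On this path the rule generates one long signal shortly after the trough and one short signal shortly after the peak, which shows the lag inherent in moving-average rules.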
[Figure 9.1: A head and shoulders top pattern]
9.6.6 Empirical evidence on technical trading rules
Until fairly recently, academic analysis of technical trading rules was thin on the
ground. However, in the late 1980s and 1990s a number of studies were conducted.
These include:
• Sweeney (1988) and Levich and Thomas (1993) applied several common rules
to daily FX data, concluding that trading profits were available.
• Brock, Lakonishok, and LeBaron (1992) applied moving average and support-
resistance rules to daily stock index data, also concluding that they had the
power to forecast returns.
In all these cases, the returns from the trading rules outperformed the simple strategy
of just buying and holding the asset in question. However, some caution is
required in interpreting these results. All fall prey to the problem that they do not
risk-adjust their trading rule returns.
9.7 Tests of SSFE: event studies
The event study is the key statistical tool for evaluating semi-strong form efficiency of
securities markets. Recall that SSFE is concerned with whether all public information
is correctly reflected in security prices. The event study allows the researcher to
evaluate whether, around public releases of certain pieces of corporate information,
abnormal returns are available to investors. The steps in conducting an event study
are as follows.
1. First of all, we must decide on exactly which type of event we will examine. We
might choose to look at earnings announcements or dividend announcements
or announcements of stock splits, for example. Let us assume that we choose
to examine announcements of stock splits.
2. Next we collect, for a specified universe of firms, information on the dates and
contents of each stock split announcement over a specified historical period (e.g.
we look 10 years into the past for the FTSE-100 companies).
3. Next we choose the window our event study will cover. For example, we might
look at how markets react from 10 days before to 50 days after each event such
that, including the event day itself, we consider a window of 61 days in total
around each event.
4. Next, we need excess returns for each of these days for all events. Thus we
choose an asset pricing model (e.g. the CAPM) and for each stock and each
event date we must compute the abnormal return for each of the 61 days in the
event study window.
5. For a simple event study we then form averages of the excess returns on each of
the days in the event study window, i.e. we compute an average of all excess returns
on day -10 relative to the event, then do the same for day -9, and so on. Thus we end
up with a mean excess return for each day in the event window. Summing these
means across the window gives the cumulative excess return. Such returns
can be plotted to give a visual representation of the outcome of the event study.
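Step 5 can be sketched in a few lines, assuming the abnormal returns from step 4 have already been arranged in a matrix with one row per event and one column per day of the event window. The function name and the toy numbers are hypothetical.

```python
import numpy as np


def event_study_profile(abnormal_returns):
    """Average abnormal returns across events and cumulate them over the window.

    abnormal_returns: array of shape (n_events, n_days), one row per event and
    one column per day in the event window (e.g. 61 columns for days -10..+50).
    """
    ar = np.asarray(abnormal_returns, dtype=float)
    mean_ar = ar.mean(axis=0)          # mean abnormal return on each event day
    cumulative = np.cumsum(mean_ar)    # cumulative mean abnormal return
    return mean_ar, cumulative


# Toy data (assumed): 3 events, window covering days -1, 0 and +1.
toy = [[0.00, 0.04, 0.00],
       [0.01, 0.02, -0.01],
       [-0.01, 0.03, 0.01]]
mean_ar, car = event_study_profile(toy)
print(mean_ar, car)
```

In this toy case the cumulative series jumps on the event day and is flat afterwards, the pattern one would expect from an instant price reaction.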
9.7.1 Interpreting an event study plot
Let us assume that the type of event we have chosen to focus on should lead to an
increase in prices if there is good news. Some potential shapes for the cumulative
excess return profile are shown in Figure 9.2.
[Figure 9.2: Event study cumulative return profiles. Horizontal axis: days relative to
the event (-10 to 50). Lines: instant reaction, over-reaction, under-reaction.]
A SSFE market: the solid line depicts what we might expect to see for the
cumulative excess return in an efficient market. There is no average price reaction before
the event. On the event date itself (i.e. when information is released) the price
(or cumulative excess return) jumps upwards to a new level and stays at that level
throughout the post-event period. There are no profit opportunities here.
Inefficient markets: two very different scenarios are plotted alongside the solid
line. In those cases the cumulative return response also jumps on the event date but
then, in the post-event window, goes on to drift upwards or downwards. In both cases
excess returns are available.
Pre-announcement returns: note that in all of the cases above we have not
mentioned possible cumulative excess returns in the pre-announcement period. This
is because there are a number of potential explanations for positive excess returns
pre-announcement that do not rely on SSF inefficiency.
For example, assume that we are studying the announcement of new equity issues.
Perhaps firms tend to issue new equity when they believe their existing shares are
over-valued, as then they are likely to get more than fair value for the new shares sold. In
this case one would expect to see new equity issues being preceded by positive excess
returns, not because of inefficient markets in a SSF sense, but because of the way
managers of firms choose to time such events.
9.7.2 Results from event studies
The event study literature is wide and diverse. The table below contains a summary
of results for a few of the more well-known event studies.
Study                    Event                           Results
Asquith (1983)           Merger announcements            Price of acquirer tends to fall in post-event period
Asquith/Mullins (1983)   Unexpected dividend increases   Positive excess returns
Asquith/Mullins (1986)   New equity issues               Negative excess returns
Bernard/Thomas (1990)    Earnings announcements          Excess returns
In many cases, event studies provide some evidence that positive excess
returns, in excess of trading costs, may be available. Having said this, results that
indicate inefficiency tend to garner more attention than those that tend to support
the efficient markets hypothesis.
There is a very large number of studies that fall into the latter category such that,
on balance, the evidence from event studies suggests that markets react quickly and
accurately to new information and, therefore, semi-strong efficiency prevails. See, for
a summary, Fama (1991).
9.8 Tests of strong-form efficiency
Most tests of strong-form market efficiency focus on whether one can find evidence of
corporate insiders or investment professionals having superior information to the rest
of the market.
Trades of corporate insiders: several event studies test whether the trades of
company directors are correlated with subsequent share performance. The balance
of evidence suggests that directors tend to buy just before periods of positive excess
returns and sell just before periods of negative excess returns. See Seyhun (1986) for
US and King and Roell (1988) for UK evidence.
Performance of mutual fund managers: one might expect the portfolios of active
fund managers to generate positive excess returns given their presumed expertise.
However, the evidence from this literature suggests the opposite. The average mutual
fund actually generates marginally negative excess returns. Of course, some managers
tend to outperform the market, but few (if any) can generate consistent, positive
excess returns.
Value of equity analyst recommendations: intuition would suggest that, if
analysts do uncover any useful information, their stock recommendations should enable
an investor to form portfolios yielding positive excess returns. The evidence here
does not tend to support strong-form efficiency, as portfolios built on analyst
recommendations are often shown to outperform. It would appear that equity analysts do
uncover useful information in their study of stocks' fundamental characteristics. See
the results in Barber, Lehavy, McNichols, and Trueman (2001) for example.
9.9 Conclusion
In this chapter we have introduced the notion of market efficiency, reviewed the
techniques used to evaluate efficiency and looked briefly at the results of some efficiency
studies.
By and large, empirical evidence is fairly supportive of the notions of weak and semi-
strong form efficiency, especially once the effects of data-snooping are accounted for.
Tests of strong form efficiency, however, often point to violations of this efficiency
notion. Thus it would appear that corporate insiders and investment professionals
(aside from mutual fund managers) do possess superior information to normal
investors.
CHAPTER 10
Derivatives: instruments and pricing
In this chapter we introduce the various kinds of derivative contracts that are most
frequently traded in today's financial markets. We then go on to discuss how these
assets are priced.
As in our APT work, our fundamental pricing approach will be to use absence of
arbitrage arguments. Finally we will discuss a few topics related to the management
of portfolios that contain derivatives.
Derivatives markets are among the fastest growing securities markets at present.
Thus, a clear understanding of the basic varieties of derivatives will be valuable to
all considering work in finance.
10.1 Basics
Let's start by defining some basic terminology and concepts, beginning with a
definition:
Definition: a derivative security is a security for which the payoff is
governed entirely by the value of one or more underlying assets.
Given the above, it is obvious that these assets are called derivatives because their payoffs
are derived from the prices of other assets. However, other than specifying this
feature, the definition gives little useful information. To make things more concrete
we need to specify the underlying asset and to be able to deduce how the derivative
payoff relates to the underlying price.
10.1.1 Vanilla derivatives
Below is a list of some of the most common, also known as vanilla, derivatives. In
our later analysis we will focus on the members of this list. Beside each item in the
list is a brief description of its payoff profile:
• Forward and futures contracts: the obligation to buy/sell the underlying
asset at a pre-specified price and on a pre-specified date.
• Call options: the right to buy the underlying asset for a pre-specified price
and on (or before) a pre-specified date.
• Put options: the right to sell the underlying asset for a pre-specified price
and on (or before) a pre-specified date.
• Swaps: the exchange of cashflows for a pre-specified period of time. The
cashflows are determined by some preset rule and are based on the value of
some underlying assets.
More complicated derivative securities are often called exotics. They include more
complex contracts (e.g. lookback options, mortgage-backed securities and
swaptions).
The particular variety of derivative determines how the payoff is related to the
underlying asset. Clearly, the other important object to specify is the identity of the
underlying. Amongst other things, one can create derivatives on:
• Equities: single stocks (e.g. BP or MSFT) or equity indices (e.g. the FTSE-
100, S&P-500, Nikkei 225).
• Currencies: exchange rates such as USD/EUR or JPY/GBP.
• Bonds and interest rates: for example, LIBOR, Bunds and Gilts.
• Commodities: such as gold, oil and coffee.
Effectively, if one can observe the price of a security or the value of a particular
indicator then one can write a derivative on it. These days we trade things as interesting
as weather derivatives and energy derivatives.
10.1.2 Uses for Derivatives
Finally, it is useful at this point to ask exactly what financial market participants use
derivatives for. Derivatives are used for a number of reasons, with the most common
listed below:
• Hedging: many market participants, for example corporate treasurers in oil
firms or exporting firms, use derivatives to remove or reduce the risks associated
with their economic activities or investment portfolios.
• Making bets: one can use derivatives to deliberately gain exposure to certain
risks, e.g. if you have a view on the future value of the FTSE-100, you could
exploit this using FTSE futures.
• Creating arbitrage portfolios: as derivatives are based on the prices of
underlying assets, one can sometimes construct portfolios of derivatives and the
underlying that yield risk-free arbitrage profits.
10.2 Forwards and futures contracts
Forwards are amongst the oldest derivative contracts, having been traded
(particularly in the agricultural sector) for hundreds of years. Futures are really just slightly
more standardized forward contracts.
Definition: a forward contract is an agreement to buy/sell a certain
quantity of an asset for a pre-specified price at a particular future date.
Forwards tend to have the following characteristics:
• They are bilaterally agreed contracts and are not exchange traded. Thus they
are often described as "over the counter" or OTC.
• At inception the forward price (\(F\)), i.e. the price at which the exchange is to
occur in the future, is set such that the value of the contract is zero and, thus,
no money changes hands on the inception date.
The individual contracted to purchase the asset is said to have a long position in the
forward while the individual selling the asset has a short position.
Long payoff: the individual holding the long side has agreed to buy the specified
asset at price \(F\) at time \(T\). Clearly, if the market price for the asset at \(T\) exceeds the
price he has promised to pay he has made a profit equal to the difference between
the market price and the forward price. If at \(T\) the market price is lower than the
forward price then the holder of the long side makes a loss equal to the difference
between forward price and market price. In either case, the net gain, or payoff, to
the long side is given by:
\[ \text{Payoff}_l = S_T - F \]
where \(S_T\) is the market price of the underlying asset at the maturity date, \(T\).
Short payoff: the payoff to the short side of the contract is obviously the negative
of the payoff of the long side. If the long side makes a profit of \(X\), the individual who
is short must make a loss of \(X\). Thus the payoff to the short side of the contract is
given by:
\[ \text{Payoff}_s = F - S_T \]
[Figure 10.1: Long forward payoff. Profit is plotted against \(S_T\), with \(F\) marked on the horizontal axis.]
10.2.1 Example: forward contract
A wheat farmer might arrange the following contract with a buyer (prior to harvest):
To supply 5000 bushels of wheat, 6 months from today (i.e. on harvest), at a price
of £3.25 per bushel (i.e. for a total consideration of £16,250).
On the delivery date, the market price of wheat turns out to be £3.50. Thus, relative
to selling at the market price, the farmer makes a loss of £0.25 per bushel, or £1250 in total.
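The payoff formulas for the long and short sides can be checked against the wheat example with a short sketch. The function names are hypothetical, and the farmer is treated as the short side since he has agreed to sell.

```python
def long_forward_payoff(spot_at_maturity, forward_price, quantity=1):
    """Payoff to the long side: (S_T - F) per unit of the underlying."""
    return (spot_at_maturity - forward_price) * quantity


def short_forward_payoff(spot_at_maturity, forward_price, quantity=1):
    """Payoff to the short side: the negative of the long payoff."""
    return -long_forward_payoff(spot_at_maturity, forward_price, quantity)


# Wheat example: the farmer is short 5000 bushels at F = 3.25; S_T = 3.50.
farmer = short_forward_payoff(3.50, 3.25, quantity=5000)
buyer = long_forward_payoff(3.50, 3.25, quantity=5000)
print(farmer, buyer)
```

The two payoffs sum to zero: whatever the farmer loses relative to the market price, the buyer gains.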
[Figure 10.2: Short forward payoff. Profit is plotted against \(S_T\), with \(F\) marked on the horizontal axis.]
10.3 Futures contracts
Forwards and futures are very similar assets in terms of payoff. In both cases, the
total payoff to the long side is equal to the difference between the underlying price
at the delivery date and the agreed delivery price. However, they differ in a couple of
key areas:
• Trading location:
  - Forwards are bilateral agreements made between two parties.
  - Futures are exchange-traded, standardised contracts.
• Exchange of monies:
  - Forwards: money changes hands only on the agreed delivery date.
  - Futures: money changes hands throughout the contract lifetime via a
    process called marking to market.
Thus, while in total the payoffs to long and short sides of a futures contract are exactly
as shown in the prior Figures, the money is exchanged throughout the lifetime of the
contract rather than in one discrete lump on the delivery date. This marking to
market makes use of the following tools:
• Margin account: an account holding monies deposited by the long party in
the futures contract.
• Initial margin: the initial amount the buyer must deposit in the margin
account when the contract is opened.
• Variation: as the market price of the futures contract changes the balance in
the margin account is altered accordingly.
• Maintenance margin: if the balance in the account falls below a pre-specified
value (the maintenance margin), the buyer must top up the balance to the initial margin.
The margin account system and marking to market are overseen by the exchange
upon which the future is traded.
10.3.1 Marking to market: example
Consider a US investor who wishes to speculate. He believes that the price of gold
will rise over the coming 2 or 3 months. Rather than buying gold itself (and thus
having to deal with storage and security), he takes a long position in a gold futures
contract with the following specifications:
• Contract size: 100oz
• Futures price: $300 per oz
• Maturity: 90 days
• Initial margin: $1500
• Maintenance margin: $1000
Below we give a hypothetical path for the futures price over the contract lifetime and
show how the margin account operates for our US speculator:
Futures   Total     Relative   Margin    Margin   Balance
Price     Change    Change     Balance   Call     after call
300.00                         1,500     0        1,500
298.40    -160      -160       1,340     0        1,340
294.60    -540      -380       960       +540     1,500
292.80    -720      -180       1,320     0        1,320
287.90    -1,210    -490       830       +670     1,500
291.20    -880      +330       1,830     0        1,830
At the opening of the contract, the speculator agrees a futures price of $300 per oz
and is required to deposit $1500 in a margin account.
In one week the price of the future has dropped to $298.40. Given the 100oz contract
size, this makes a total loss of $160. The $160 loss is deducted from the speculator's
margin account and transferred to the seller's margin account.
The following week, the market price of the future drops to $294.60. The loss
associated with this price fall is another $380, which is also deducted from the balance
in the margin account and transferred to the account of the counter-party. However,
now the balance in the margin account is below the maintenance margin of $1000.
Thus the investor is required to top up the balance in the margin account to the
initial margin level of $1500. The rest of the example continues in similar fashion.
If we proceed to total up the gains and losses to the speculator over the life of the
contract, he gains $330 (= $1,830 - $1,500) on the balance in the margin account
but has paid $1,210 (= $540 + $670) in margin calls. At maturity he pays the final
futures price of $291.20 per oz. Thus in total he pays 100 × $291.20 + $1,210 - $330 =
$30,000.
A key advantage of marking to market is that, by spreading the $30,000 payment
over time, the problems associated with one of the counter-parties defaulting on their
obligations are reduced.
In our example, if our speculator was to go bankrupt on the day before the delivery
date then his counter-party would still have received some of the money promised to
him via his margin account. In the case of a simple forward contract, bankruptcy
would result in the seller losing all that was owed to him.
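The margin account mechanics in this example can be reproduced with a short sketch. The function name is hypothetical, and two conventions are assumed: a margin call is triggered when the balance falls strictly below the maintenance margin, and the account is then topped up to the initial margin. Amounts are rounded to cents to avoid floating-point noise.

```python
def mark_to_market(prices, contract_size, initial_margin, maintenance_margin):
    """Track a long futures margin account along a path of futures prices.

    Returns a list of (price, balance_after_any_top_up, margin_call) tuples,
    one per price after the opening price.
    """
    balance = float(initial_margin)
    history = []
    for previous, current in zip(prices, prices[1:]):
        # Gain or loss on the long position since the last marking.
        balance = round(balance + (current - previous) * contract_size, 2)
        call = 0.0
        if balance < maintenance_margin:
            # Assumed rule: top the account back up to the initial margin.
            call = round(initial_margin - balance, 2)
            balance = float(initial_margin)
        history.append((current, balance, call))
    return history


path = [300.00, 298.40, 294.60, 292.80, 287.90, 291.20]
history = mark_to_market(path, contract_size=100,
                         initial_margin=1500, maintenance_margin=1000)
total_calls = sum(call for _, _, call in history)
final_balance = history[-1][1]
# Total outlay: final futures price plus margin calls, less the gain on the balance.
total_paid = round(100 * path[-1] + total_calls - (final_balance - 1500), 2)
print(history)
print(total_calls, final_balance, total_paid)
```

Whatever path the futures price takes, the total outlay computed this way equals the contract size times the original futures price, which is why the payoff of the future matches that of the corresponding forward in total.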
10.4 Options
Options are among the most liquid and fastest growing segments of asset markets.
In order to understand their usefulness and features lets start with the denition of
a standard option contract.
152
Denition: an option gives its owner the right, but not the obligation, to
buy or sell a given quantity of a specied asset at some particular future
date for a pre-determined price, known as the strike price or exercise price.
So, we have 5 fundamental things to specify in the contract terms; underlying asset,
quantity, maturity date, strike price and whether the owner can buy or sell. This list
will become longer as the option becomes more complex.
An individual who has purchased an option contract is said to be long the option
and an individual who has sold an option contract is said to have written the option.
We can fundamentally distinguish two types of option according to whether they give
the right to buy or sell the underlying asset;
Call option: gives the owner the right to buy the underlying asset.
Put option: gives the owner the right to sell the underlying asset.
Another fundamental distinction can be made on the basis of times at which the
holder can exercise his right to buy/sell;
European option: the owner can buy/sell on the maturity date only.
American option: the holder can choose to buy/sell at any date up to and
including the maturity date.
Option: example contract: on 14/11/2005 the following contract
was being traded on Euronext-LIFFE;
A December 05 call option on Vodafone, with strike price £1.40, and giving
rights to 100 shares, was trading for £0.09. At the same time, the spot
price of Vodafone shares was £1.46.
Thus if one were to spend £9 (= £0.09 × 100);
You'd have the right to buy 100 shares of Vodafone.
You could exercise on any business day up to the maturity date (which
was the third Friday in December 2005).
If exercised you'd pay £1.40 × 100 = £140 for 100 Vodafone shares.
10.4.1 Option payoffs
Consider a call option with strike price K and where at the current time the price of
the underlying is S. We have the following terminology.
The call is in the money if the underlying price exceeds the strike price (i.e.
\(S - K > 0\)). Then the option can be exercised at a profit.
The call is at the money if the underlying price is the same as the strike price
(i.e. \(S - K = 0\)).
The call is out of the money if the strike price is above the underlying price
(i.e. \(S - K < 0\)) as then the holder can make no money through exercise and
will thus choose not to exercise.
We can provide the same definitions for a put option.
10.4.2 Long Call Payoff
Based on the above we can characterise the payoff to the investor with a long call
option position.
When the underlying price (S) exceeds the exercise price (K), this investor will
choose to exercise his right to buy the asset and will make a gain on exercise
of \(S - K\).
However, when the current underlying price is below the exercise price the
individual will choose not to exercise (as it would be cheaper to buy the asset
in the open market) such that his gain is zero.
Putting these two arguments together, the payoff to the long position is;
\[ \text{Payoff}_l = \max[S - K, 0] \]
The key feature of the preceding payoff is its non-linearity. The fact that the holder
of the option is not obliged to purchase the asset means that he only does so when it
is profitable and this creates a kink in the payoff profile.
Unlike when entering a forward contract, the long side of the option contract is
required to pay a price, called the premium, to the individual writing the contract when
the agreement is made. Intuitively, as an individual holding an option never receives
a negative payoff at expiry and sometimes gets a positive payoff, that individual must
pay a positive price for the option up front.
The total profit to the long side is given by the payoff in the preceding equation less
the option premium.
The writer of the option has a profit that is equal to the mirror image of the long-side.
He receives the option premium when the contract is struck but, when the underlying
price exceeds the strike price, his payoff is eroded by the difference between the
underlying price and the strike price as his counter-party exercises the option.
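The payoff and profit definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the strike and premium values are assumptions chosen for the example, not figures from the text.

```python
def long_call_payoff(s, k):
    """Payoff at expiry of a long call: max[S - K, 0]."""
    return max(s - k, 0.0)


def long_call_profit(s, k, premium):
    """Profit to the long side: payoff less the premium paid up front."""
    return long_call_payoff(s, k) - premium


def short_call_profit(s, k, premium):
    """Profit to the writer: the mirror image of the long side."""
    return -long_call_profit(s, k, premium)


# Hypothetical contract: strike 100, premium 5.
strike, premium = 100.0, 5.0
for s in (80.0, 100.0, 105.0, 120.0):
    print(s, long_call_profit(s, strike, premium), short_call_profit(s, strike, premium))
```

The long side never loses more than the premium, while the two profits always sum to zero.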
Figure 10.3: Long call profit (axes: Profit against \(S_T\); strike K marked)
10.4.3 Long put payoff


Similar arguments to those above tell us that the payoff from holding a long position
in a put option is given by;
\[ \text{Payoff}_l = \max[K - S, 0] \]
as when the spot price exceeds the strike price the investor chooses not to exercise his
right to sell and when the spot price is below the strike price the option is exercised.
Again, though, one must deduct the put option price/premium from this payoff to
get the total profit to the long-side.
The payoff to the short side is once again the mirror image of the long side.
Note that, in either put or call case, the long-side of the option has limited downside
risk as he can choose not to exercise. As such he loses at most the premium he paid
for the option. A long position in a call has unlimited upside potential, however.
On the flip-side, an individual who has written a call has the possibility of unbounded
losses.
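The put case can be sketched in the same way. The snippet below is illustrative only (the function names, strike and premium are assumed values); it shows that the long put loses at most the premium and that the writer's profit is the mirror image.

```python
def long_put_payoff(s, k):
    """Payoff at expiry of a long put: max[K - S, 0]."""
    return max(k - s, 0.0)


def long_put_profit(s, k, premium):
    """Profit to the long side: payoff less the premium paid up front."""
    return long_put_payoff(s, k) - premium


def short_put_profit(s, k, premium):
    """Profit to the writer of the put: mirror image of the long side."""
    return -long_put_profit(s, k, premium)


# Hypothetical contract: strike 100, premium 4.
strike, premium = 100.0, 4.0
prices = [0.0, 50.0, 96.0, 100.0, 150.0]
long_profits = [long_put_profit(s, strike, premium) for s in prices]
print(long_profits)
print(min(long_profits))  # the worst case for the long side is minus the premium
```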
Figure 10.4: Short call profit (axes: Profit against \(S_T\); strike K marked)
Figure 10.5: Long put profit (axes: Profit against \(S_T\); strike K marked)
Figure 10.6: Short put profit (axes: Profit against \(S_T\); strike K marked)

10.4.4 Options, leverage and risk


Consider the case of an individual who is long a call option. Given that his loss
is limited to the premium he has paid for the option it would seem that this
position is not very risky. If he was long the underlying instead, he could potentially
lose his entire investment.
Does this argument that the option position is not very risky stand up? The answer
is no as the example below should demonstrate.
The stock of Corporation A currently trades at £35 per share. A call option with
strike price £30 is also trading and currently each option costs £5. The following
two portfolios both cost £35;
Portfolio A: a long position in 1 unit of stock.
Portfolio B: the purchase of 7 calls with strike price £30.
The return profiles of the two portfolios are shown in Figure 10.7 where the y-axis
measures the return on each portfolio and the x-axis is the price of the underlying
at exercise. This figure clearly demonstrates the greatly increased risk of the option
portfolio over the stock portfolio.
Figure 10.7: Comparing the risk of stock and option positions (series: Long stock and
Long call option; y-axis: portfolio return in percent; x-axis: underlying price at
exercise, from 0 to 50)
For any underlying price at maturity below £30, the option portfolio loses its entire
value while the stock portfolio always retains some value (aside from in the unlikely
case that the stock price falls to exactly zero). On the upside, the option portfolio
returns reach great levels for relatively small up moves of the stock price (e.g. at
£40 the option portfolio makes a 100% return). For similarly sized stock moves, the
return on the stock portfolio is much smaller.
The increased risk of the option portfolio over the stock portfolio is due to leverage,
a concept we will return to when pricing options.
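The comparison can be checked numerically. The sketch below uses the figures from the example (stock at 35, seven calls struck at 30 costing 5 each); the function names are illustrative.

```python
INITIAL_COST = 35.0   # both portfolios cost 35
STRIKE = 30.0
N_CALLS = 7           # 7 calls at 5 each = 35


def stock_return(s_t):
    """Return on one unit of stock bought at 35 and worth s_t at exercise."""
    return (s_t - INITIAL_COST) / INITIAL_COST


def option_return(s_t):
    """Return on 7 calls struck at 30, bought for 35 in total."""
    payoff = N_CALLS * max(s_t - STRIKE, 0.0)
    return (payoff - INITIAL_COST) / INITIAL_COST


for s_t in (25.0, 30.0, 35.0, 40.0, 45.0):
    print(s_t, round(100 * stock_return(s_t), 1), round(100 * option_return(s_t), 1))
```

At a price of 40 the stock portfolio returns roughly 14% while the option portfolio returns 100%; at any price of 30 or below the option portfolio returns -100%.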
10.5 Option combinations: payoffs and uses
Because of the non-linearity in their payoff profiles (at maturity), options can be
combined to yield some very interesting payoff structures and some useful portfolios.
These portfolios can be built to exploit rising or falling markets, limit upside and
downside exposure and create exposure to volatility (i.e. big moves in the price of
the underlying in either direction).
We'll briefly look at the following combined option positions;
Bull and bear spreads
Butterfly spreads
Straddles and strangles
Strips and straps
10.5.1 Bull and bear spreads
Consider an investor with a mildly bullish view but who wants a portfolio that's
protected against extreme price moves. He can choose one of the following option
combinations.
Bull call spread: combine a long position in a call option with a low strike
(\(K_1\)) with a short position in a call option with a higher strike (\(K_2\)).
Bull put spread: combine a long position in a put option with a low strike
(\(K_1\)) with a short position in a put option with a higher strike (\(K_2\)).
Both of these positions yield the desired portfolio payoff profile. An explanation for
the payoff profile of the bull call spread is given below.
Figure 10.8: Bull spread using Calls (axes: Profit against \(S_T\); strikes \(K_1\) and \(K_2\) marked)

Example; payoff profile for bull call spread: define the underlying
price at maturity to be \(S_T\). The bull call spread consists of a long call
position with strike \(K_1\) and a short call position with strike \(K_2\) where
\(K_2 > K_1\);

Position              Pay-Off
                      S_T ≤ K_1     K_1 ≤ S_T ≤ K_2     K_2 ≤ S_T
Long 1 Call @ K_1     0             S_T - K_1           S_T - K_1
Short 1 Call @ K_2    0             0                   K_2 - S_T
Total                 0             S_T - K_1           K_2 - K_1

Self-assessment exercise: demonstrate that the bull put spread defined
previously gives a similar portfolio payoff profile.
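The three regions of the bull call spread payoff can be verified with a small Python sketch. The strikes \(K_1 = 90\) and \(K_2 = 110\) and the function names are hypothetical choices for illustration.

```python
def call_payoff(s_t, k):
    """Payoff at maturity of a long call struck at k."""
    return max(s_t - k, 0.0)


def bull_call_spread_payoff(s_t, k1, k2):
    """Long one call at the low strike k1, short one call at the higher strike k2."""
    return call_payoff(s_t, k1) - call_payoff(s_t, k2)


k1, k2 = 90.0, 110.0  # hypothetical strikes with k2 > k1
for s_t in (80.0, 100.0, 130.0):
    print(s_t, bull_call_spread_payoff(s_t, k1, k2))
```

Below \(K_1\) the payoff is zero, between the strikes it is \(S_T - K_1\), and above \(K_2\) it is capped at \(K_2 - K_1\).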
Figure 10.9: Bull spread using Puts (axes: Profit against \(S_T\); strikes \(K_1\) and \(K_2\) marked)

10.5.2 Bear spreads


An investor who holds a mildly bearish view about the price of the underlying but
wants a position that's not too sensitive to extreme market movements might
construct a bear spread.
Bear call spread: combine a short position in a call option with a low strike
(\(K_1\)) with a long position in a call option with a higher strike (\(K_2\)).
Bear put spread: combine a short position in a put option with a low strike
(\(K_1\)) with a long position in a put option with a higher strike (\(K_2\)).
Self-assessment exercise: demonstrate that the bear spread positions have the
desired payoff profiles.
10.5.3 The Butterfly spread
Consider an investor who believes that volatility in the underlying will be low until
maturity (i.e. the price of the underlying will change little from its current value,
either up or down).
Figure 10.10: Bear spread using Calls (axes: Profit against \(S_T\); strikes \(K_1\) and \(K_2\) marked)
Figure 10.11: Bear spread using Puts (axes: Profit against \(S_T\); strikes \(K_1\) and \(K_2\) marked)
Butterfly spread using calls: go long one call with a low exercise price (\(K_1\)),
short two calls with a medium strike (\(K_2\)) and long one call with a high exercise
price (\(K_3\)).
Butterfly spread using puts: go long one put with a low exercise price (\(K_1\)),
short two puts with a medium strike (\(K_2\)) and long one put with a high exercise
price (\(K_3\)).
Again, the worked example for the call spread is shown below.
10.5.4 Example; Payoff derivation for call Butterfly Spread

Position               Pay-Off
                       S_T ≤ K_1    K_1 ≤ S_T ≤ K_2    K_2 ≤ S_T ≤ K_3    K_3 ≤ S_T
Long 1 Call @ K_1      0            S_T - K_1          S_T - K_1          S_T - K_1
Short 2 Calls @ K_2    0            0                  2(K_2 - S_T)       2(K_2 - S_T)
Long 1 Call @ K_3      0            0                  0                  S_T - K_3
Total                  0            S_T - K_1          K_3 - S_T (*)      0 (*)

(*: Since \(2K_2 = K_1 + K_3\))
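The totals in the butterfly table can be confirmed numerically. In the sketch below the strikes 90, 100 and 110 are hypothetical values chosen to satisfy \(2K_2 = K_1 + K_3\), and the function names are illustrative.

```python
def call_payoff(s_t, k):
    """Payoff at maturity of a long call struck at k."""
    return max(s_t - k, 0.0)


def call_butterfly_payoff(s_t, k1, k2, k3):
    """Long one call at k1, short two calls at k2, long one call at k3."""
    return call_payoff(s_t, k1) - 2.0 * call_payoff(s_t, k2) + call_payoff(s_t, k3)


k1, k2, k3 = 90.0, 100.0, 110.0  # hypothetical strikes with 2*k2 == k1 + k3
for s_t in (85.0, 95.0, 100.0, 105.0, 120.0):
    print(s_t, call_butterfly_payoff(s_t, k1, k2, k3))
```

The payoff peaks at \(K_2\) and is zero outside the range from \(K_1\) to \(K_3\), which is why the position suits an investor expecting low volatility.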
10.5.5 The Straddle
Consider an investor who wants to bet on volatility in the underlying asset i.e. he
wants a portfolio that is valuable both when there are extreme up and down
movements in the underlying price.
Straddle: go long a put and long a call with the same strike (K).
Strangle: go long a put with a low strike price (\(K_1\)) and long a call with a
high strike price (\(K_2\)).
Figure 10.12: Butterfly spread using Calls (axes: Profit against \(S_T\); strikes \(K_1\), \(K_2\) and \(K_3\) marked)
Figure 10.13: Butterfly spread using Puts (axes: Profit against \(S_T\); strikes \(K_1\), \(K_2\) and \(K_3\) marked)
Figure 10.14: Long straddle (axes: Profit against \(S_T\); strike K marked)

Both will pay off when there are extreme up or down moves in the underlying price.
Relative to the straddle, however, the strangle widens the range of underlying prices
over which the buyer makes no profit.
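A short Python sketch makes the difference between the two positions concrete. The strikes used (100 for the straddle, 90 and 110 for the strangle) and the function names are assumptions for illustration; premiums are ignored, so the numbers are payoffs rather than profits.

```python
def straddle_payoff(s_t, k):
    """Long one put and one call, both struck at k."""
    return max(k - s_t, 0.0) + max(s_t - k, 0.0)


def strangle_payoff(s_t, k1, k2):
    """Long one put struck at k1 and one call struck at k2, with k1 < k2."""
    return max(k1 - s_t, 0.0) + max(s_t - k2, 0.0)


for s_t in (70.0, 90.0, 100.0, 110.0, 130.0):
    print(s_t, straddle_payoff(s_t, 100.0), strangle_payoff(s_t, 90.0, 110.0))
```

The straddle pays nothing only when the price finishes exactly at K, whereas the strangle pays nothing anywhere between \(K_1\) and \(K_2\).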
10.5.6 Strips and Straps
Finally, consider an investor who again wants to bet on volatility in the underlying
price but with a bullish bias (i.e. he believes that prices will move a lot but believes
that they're more likely to rise a lot than fall a lot). A different investor might want
to bet on high volatility but with a bearish bias.
Strap: combination of two calls and one put, here with the same strike price.
This would appeal to the first investor.
Strip: combination of two puts and one call with the same strike price; this
would appeal to the second investor.
All that straps (strips) do is make the portfolio more sensitive to up (down) moves
by increasing the number of calls (puts) held relative to puts (calls). This means that
the portfolio payoff profile will be steeper on the upside (downside).
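The asymmetry of the two positions is easy to see in code. The following sketch assumes a common strike of 100 (a hypothetical value) and ignores premiums; the function names are illustrative.

```python
def strap_payoff(s_t, k):
    """Two long calls and one long put, all struck at k (bullish volatility bet)."""
    return 2.0 * max(s_t - k, 0.0) + max(k - s_t, 0.0)


def strip_payoff(s_t, k):
    """Two long puts and one long call, all struck at k (bearish volatility bet)."""
    return 2.0 * max(k - s_t, 0.0) + max(s_t - k, 0.0)


k = 100.0  # hypothetical common strike
for s_t in (90.0, 100.0, 110.0):
    print(s_t, strap_payoff(s_t, k), strip_payoff(s_t, k))
```

A move of 10 upwards pays 20 on the strap but only 10 on the strip, and the reverse holds for a move of 10 downwards.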
Figure 10.15: Long strangle (axes: Profit against \(S_T\); strikes \(K_1\) and \(K_2\) marked)
Figure 10.16: Long strip (axes: Profit against \(S_T\); strike K marked)
Figure 10.17: Long strap (axes: Profit against \(S_T\); strike K marked)

10.6 Swaps
The final type of derivative we'll encounter is the swap. As the name suggests, all
this type of contract consists of is an agreement to exchange (i.e. swap) the cashflows
from two different assets. The underlying assets for a swap might include the following;
bonds, currencies, equities or equity indices. We'll briefly discuss the characteristics
of the most familiar swap contracts and how they might be priced in practice.
10.6.1 Vanilla Interest rate swap
The basic mechanics of a vanilla interest rate swap are as follows;
Individual A agrees to pay individual B a floating rate of interest on a given
notional principal over a specific period while individual B promises to pay
A a pre-determined fixed rate of interest on the same principal.
The fixed rate paid by B is known as the swap rate. Note that, like forwards but
unlike options, these swaps are self-financing in that the contract is structured so
that no money changes hands at inception.
The floating rate paid by A is usually equal to or based upon one of several market
interest rates e.g. LIBOR in London, the prime rate in the US, PIBOR in Paris. For
example, A might offer to pay B LIBOR plus 50 basis points.
One thing to note regarding this structure is that the payment made by A at the end
of a given period is determined by the floating rate measured at the beginning of the
period. Thus, each floating payment is effectively known one period in advance.
Example; vanilla interest rate swap: at the beginning of 2003, company
A agrees to pay LIBOR to company B who agrees to pay 6% fixed to A. The
notional principal is £100,000. The period is 3 years and the payments
are made semi-annually.

Date          LIBOR (%)    Floating payment    Fixed payment    Net (to A)
30/06/2003    5.80         2900                3000             100
31/12/2003    6.10         3050                3000             -50
30/06/2004    6.20         3100                3000             -100
31/12/2004    5.90         2950                3000             50
30/06/2005    5.70         2850                3000             150
31/12/2005    5.80         2900                3000             100

Notes: only the net cashflow is exchanged. The column labelled LIBOR
gives beginning of period LIBOR. The notional principal is not exchanged.
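The cashflows in the example can be reproduced with a short Python sketch. It assumes a notional principal of 100,000, which is the figure consistent with the payment amounts in the table, and quotes rates in basis points to keep the arithmetic exact; the variable names are illustrative.

```python
NOTIONAL = 100_000        # assumed: the notional that reproduces the table's payments
FIXED_RATE_BP = 600       # 6% fixed, in basis points
PAYMENTS_PER_YEAR = 2     # semi-annual payments

# Beginning-of-period LIBOR for each of the six periods, in basis points.
libor_bp = [580, 610, 620, 590, 570, 580]

fixed_payment = NOTIONAL * FIXED_RATE_BP / 10_000 / PAYMENTS_PER_YEAR

net_to_a = []
for rate_bp in libor_bp:
    floating_payment = NOTIONAL * rate_bp / 10_000 / PAYMENTS_PER_YEAR
    # A pays floating and receives fixed, so the net to A is fixed minus floating.
    net_to_a.append(fixed_payment - floating_payment)

print(fixed_payment)
print(net_to_a)
```

A receives a net payment whenever beginning-of-period LIBOR is below the 6% swap rate and makes a net payment whenever it is above.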
10.6.2 What use is a swap?
First of all, one can use a swap to transform a fixed rate loan into a floating rate loan
and vice-versa. If you're currently paying a fixed interest rate but would prefer to
pay a floating rate, you enter into a swap where you receive fixed and pay floating;
the fixed sides cancel out and you're left with a floating payment. Similarly, an
asset paying a fixed rate can be transformed to pay a floating rate (and vice versa)
using a swap.
Firms might want to perform such changes because they might find themselves in a
position where it's relatively cheap for them to borrow fixed, for example, but they'd
really prefer to pay floating interest. Then they can take the cheap fixed rate loan
and transform it to floating with the swap. These arguments often rely on some kind of
comparative advantage idea.
10.6.3 Currency swaps
With this contract, instead of exchanging fixed and floating rates in the same
currency, parties exchange fixed rate payments on different currencies. In this case,
though, the principal is exchanged as exchange rate variation through the life of the
contract means that the final principal payments do not necessarily cancel out.
Again, these contracts are often motivated by a comparative advantage argument.
A British firm might get relatively cheaper access to Sterling loans than a Japanese
firm. Conversely, the Japanese firm gets better terms on fixed rate Yen borrowing.
Thus, if the British firm wants to borrow Yen and the Japanese firm Sterling, they
might be better off borrowing their domestic currencies and entering into a swap.
Example; currency swap: at the beginning of 2001 a British firm agrees
to receive 6% on a principal of £100,000 from a Japanese firm in exchange
for paying 8% on a notional of Yen 20,000,000 (as the current exchange rate
is Yen200 = £1). Payments are assumed to be annual covering a period of
5 years. The actual cashflows are;

Date          Sterling payment    Yen payment
31/12/2001    £6,000              Y1,600,000
31/12/2002    £6,000              Y1,600,000
31/12/2003    £6,000              Y1,600,000
31/12/2004    £6,000              Y1,600,000
31/12/2005    £106,000            Y21,600,000

Note: obviously the value of the net payments to the parties at the various
points depends on the exchange rates at those points.
10.6.4 Intermediaries
Swap contracts are usually intermediated by a financial institution. The institution
will charge a small fee for the match-making service it provides i.e. bringing the two
sides of the contract together. In such a case, the intermediary enters 2 contracts, one
with each of the counterparties. The net effect of these 2 contracts is that (usually)
both parties will end up paying slightly more than they might have done otherwise.
The slight inflation of the rates to be paid just reflects the intermediary's fees.
The fees also compensate the intermediary for the risk of not finding both sides of
the swap simultaneously. Maybe the intermediary can only find a fixed payer and
not a floating payer. In this case the intermediary still enters into the contract with
the fixed payer and waits for a floating payer to come along; in the interim the
intermediary hedges any interest rate risks it's exposed to. This is called warehousing.
10.7 Pricing derivatives
Now that we know how the basic classes of derivative assets are structured, we can
start to think about how we might price them.
Throughout this section we will use the same underlying pricing principle - absence
of arbitrage pricing - for all of our examples. This is exactly the same principle that
we discussed with reference to bond valuation much earlier in the course and the
APT.
We briefly review the mechanics of this method of pricing in the following section.
10.7.1 Pricing by absence of arbitrage
All of our derivatives pricing, whether in discrete or continuous time, will use arbitrage
pricing techniques. Let us first redefine what we mean by arbitrage.
Definition: an arbitrage opportunity is said to exist when one can
construct one of the following;
A portfolio with zero set-up cost (i.e. a zero portfolio price) but
positive subsequent payoffs.
A portfolio with a negative set-up cost and zero payoffs thereafter.
The way we're going to price assets is as follows.
We assume that investors are smart (i.e. greedy) enough to see any arbitrage
opportunities and take them. These smart investors, or arbitrageurs as they are sometimes
known, will be buying cheap assets and selling expensive assets in their exploitation of
the arbitrage opportunity. This will tend to raise the prices of the cheaper assets and
reduce those of the expensive securities and thus reduce the scope for arbitrage. Only
when the arbitrage opportunity has completely disappeared will the asset prices have
no tendency to move upwards or downwards. Thus, we focus on these no-arbitrage
situations to price our derivative assets.
In practice, we do the following. We form portfolios of assets that have identically
zero payoffs regardless of how the future unfolds. By absence of arbitrage, the price
of this portfolio must be zero and thus the portfolio-weighted sum of the individual
asset prices must be zero. If we know the prices of all but one of the assets in the
portfolio then this restriction allows us to work out the price of the last asset.
10.7.2 Implications of absence of arbitrage
A first key implication of absence of arbitrage is as follows;
The law of one price: if two portfolios have identical payoffs in all states
of nature, they must have the same price.
A second implication is;
The law of payoff dominance: if portfolio A guarantees a payoff at
least as great as portfolio B in all states of nature, then portfolio A must
command a price at least as great as that of portfolio B.
10.8 Forwards and futures prices
We will ignore the differences in the timing of cashflows between futures and forwards
and treat forwards and futures as being identical.
Setup: consider a T-period forward contract on an asset that pays no cashflows to
an investor between the current date and T. The current price of the underlying is
S. Denote the underlying price at maturity with \(S_T\). The delivery price agreed in
the forward contract is F. If \(r_T\) is the T-period spot rate, what is the price of the
forward contract?
Approach: we know that for a forward contract no cashflows are exchanged at the
date on which the contract is struck. To rule out arbitrage we must manipulate the
delivery price of the contract (F). We first construct a portfolio that replicates the
payoff from being long the forward, assuming that T-year zero-coupon bonds exist
with a face value of 1.
Replicating portfolio: buy one unit of stock and sell F T-year zero-coupon bonds.
The costs and payoffs of the forward and the replicating portfolio are as follows;

Time       Forward      Replicating p/f
t_0        0            F/(1 + r_T)^T - S
t_0 + T    S_T - F      S_T - F

Note that, by design, the replicating portfolio's payoff is always the same as the
forward contract's. However, the price of entering into the forward is zero. Thus, by
absence of arbitrage, the cost of the replicating portfolio must be zero. Hence;
\[ \frac{F}{(1 + r_T)^T} - S = 0 \]
This nails down the no-arbitrage value for the delivery price. The no-arbitrage
forward price on an asset that pays no dividends or coupons is;
\[ F = S (1 + r_T)^T \qquad (10.1) \]
If the forward price were any different from this value an arbitrage opportunity would
be available.
10.8.1 Example; forward pricing
Assume that the current price of MSFT is $26. This stock never pays dividends.
The current 2-year spot rate is 4.25%. What is the no-arbitrage price of a 2-year
forward on MSFT stock?
\[ F = 26 \times (1 + 0.0425)^2 = \$28.26 \]
The current market price of a 2 year forward on MSFT stock is $28.50. How might
one exploit this?
The forward price is too high. Thus sell the 2-year forward in the market with a
delivery price of $28.50. Simultaneously, buy a unit of stock in the stock market at
$26, financing this purchase by borrowing at the 2-year spot rate. In 2 years' time,
one uses the unit of stock to deliver on the forward contract and receives $28.50 in
cash. Also, one's debt must be repaid and this totals $28.26 (make sure that you can
verify this), leading to a profit, in 2 years, of $0.24.
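The numbers in this example can be verified with a few lines of Python. The sketch applies equation (10.1) to the MSFT figures and then computes the profit from the trade described above; the function and variable names are illustrative.

```python
def forward_price(spot, spot_rate, years):
    """No-arbitrage forward price on a non-dividend-paying asset: F = S(1 + r_T)^T."""
    return spot * (1.0 + spot_rate) ** years


spot, spot_rate, years = 26.0, 0.0425, 2
fair_forward = forward_price(spot, spot_rate, years)   # approx 28.26
market_forward = 28.50

# Sell the forward at 28.50, buy the stock at 26 with borrowed money,
# then deliver the stock and repay the loan in 2 years.
debt_repayment = spot * (1.0 + spot_rate) ** years
arbitrage_profit = market_forward - debt_repayment      # approx 0.24

print(round(fair_forward, 2), round(arbitrage_profit, 2))
```

The profit is locked in today and requires no initial outlay, which is exactly what the absence-of-arbitrage argument rules out in equilibrium.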
10.8.2 Forward prices for dividend paying assets
We can extend our analysis to deal with forwards on assets that pay dividends (or
coupons) also. Using the same notation as above, further assume that the asset we're
looking at will pay a known dividend during the lifetime of the T-period forward.
Denote the present value of this dividend payment by I. Then the no-arbitrage
T-period forward price on the dividend paying asset is;
\[ F = (S - I) (1 + r_T)^T \qquad (10.2) \]
Note that this analysis generalises to the cases where there are multiple dividend
payments, where I is then the present value of the entire set of dividend payments,
and also the case where the dividend payments are uncertain. In the latter case
I should be interpreted as the present value of the expected dividend payments.
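Equation (10.2) can be illustrated with a small Python sketch. The inputs are hypothetical: a stock at 26, a single known dividend of 1.00 paid in one year, and a flat spot rate of 4.25% used both to discount the dividend and to compound to the 2-year maturity. None of these dividend figures come from the text.

```python
def forward_price_with_dividends(spot, pv_dividends, spot_rate, years):
    """No-arbitrage forward price on a dividend-paying asset: F = (S - I)(1 + r_T)^T."""
    return (spot - pv_dividends) * (1.0 + spot_rate) ** years


# Hypothetical inputs (assumptions for illustration only).
spot, rate, years = 26.0, 0.0425, 2
dividend, dividend_time = 1.00, 1

# I is the present value of the dividend, discounted at the assumed flat rate.
pv_dividend = dividend / (1.0 + rate) ** dividend_time

f_with_dividend = forward_price_with_dividends(spot, pv_dividend, rate, years)
f_no_dividend = forward_price_with_dividends(spot, 0.0, rate, years)

print(round(f_with_dividend, 4), round(f_no_dividend, 4))
```

The forward price falls relative to the no-dividend case because the holder of the forward, unlike the holder of the stock, does not receive the dividend.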
10.9 Binomial option pricing
We now move to the valuation of option contracts. Our goal will be to derive formulae
that will allow us to determine a no-arbitrage value for an option's premium (where
the premium is the price paid when an option is purchased).
To start, we will price options using binomial models. The term binomial refers to
the assumption we're going to make in this section regarding the way in which the
price of the underlying evolves. Specifically we assume that;
The Binomial assumption: in a single period, the underlying price can
move from its current level (S) to one of only two new levels. Either the
price moves up by a factor u, thus reaching the level uS, or it moves down
by a factor d to the level dS.
This seems pretty restrictive (prices can only move to one of two different levels
in the next period) but it's easy to generalise to many periods and to think of the
period length shrinking to cover a matter of seconds so that the process above starts
to resemble something more reasonable. An example is shown in Figure 10.18.
Call specification: based on this process for the underlying, we will now price a
one-period call option via absence of arbitrage. The exercise price for the call option
is X and it is European such that it can only be exercised at the end of the period.
Figure 10.19 shows the payoff structure of this call in the binomial world.
Risk-free asset: we assume that there exists a one-period zero coupon bond with
face value 1. The one-period spot rate is r and so the price dynamics of the bond
are as shown in Figure 10.20.
Figure 10.18: Binomial process for underlying: S = 100, u = 1.25, d = 0.80 (between
\(t_0 = 0\) and \(t_1 = 1\) the stock price moves from S = 100 either up to uS = 125 or
down to dS = 80)
Figure 10.19: Binomial setting: payoff to call option struck at X (at \(t_0 = 0\) the
premium is c = ?; at \(t_1 = 1\) the payoff is \(c_u = \max[0, uS - X]\) after an up
move and \(c_d = \max[0, dS - X]\) after a down move)
10.9.1 Set-up of the replicating portfolio
Consider a replicating portfolio where we purchase \(\Delta\) units of stock and we sell N
one-period zero coupon bonds. The payoff profile of this portfolio is shown in Figure
10.21.
By comparison of Figures 10.21 and 10.19, for this portfolio to exactly replicate the
payoff of the call option, the following conditions must hold;
\[ c_u = \Delta u S - N \]
\[ c_d = \Delta d S - N \]
Given that \(c_u\), \(c_d\), u, d and S are known, we can solve this pair of equations for the
quantities of stock (\(\Delta\)) and bonds (N) in the replicating portfolio.
The solutions are;
\[ \Delta = \frac{c_u - c_d}{(u - d) S}, \qquad N = \frac{d c_u - u c_d}{u - d} \]
Figure 10.20: Binomial setting: dynamics for one-period zero-coupon bond (at
\(t_0 = 0\) the price is \(1/(1 + r)\); at \(t_1 = 1\) the payoff is 1 after both the up
and the down move)
Now we know the precise composition of the replicating portfolio, we impose
no-arbitrage and argue that the price of the call option must be equal to the cost of
setting up the replicating portfolio. Thus;
\[ c = \Delta S - \frac{N}{1 + r}, \quad \text{where } \Delta = \frac{c_u - c_d}{(u - d) S}, \quad N = \frac{d c_u - u c_d}{u - d} \qquad (10.3) \]
If we define \(q = \frac{(1 + r) - d}{u - d}\) then the preceding equation can be made to look a bit more
straightforward. We can write the no-arbitrage call price as;
\[ c = \frac{1}{(1 + r)} [q c_u + (1 - q) c_d] \qquad (10.4) \]
Note the following features of this derivative price;
The final price for the call is entirely unrelated to the probabilities of the up
or down move in the binomial process for the underlying. Indeed, we have not
even mentioned probabilities until this point. Intuitively, this is due to the fact
that our replicating portfolio is designed to match the option payoff exactly
in all possible outcomes. It doesn't matter how likely any outcome is or how
relatively likely any pair of outcomes are.
Figure 10.21: Binomial setting: dynamics for replicating portfolio (at \(t_0 = 0\) the
price is \(\Delta S - N/(1 + r)\); at \(t_1 = 1\) the payoff is \(\Delta u S - N\) after an
up move and \(\Delta d S - N\) after a down move)
The above process does not rely on the derivative we're pricing being a call
option at all. It works for any derivative. As long as you can work out the
payoff of the derivative in the up-state (\(c_u\)) and the payoff in the down-state
(\(c_d\)) then you can directly apply the final formula (i.e. equation (10.4)) to
whatever derivative you want.
10.9.2 Replication versus risk-neutral pricing
Equations (10.3) and (10.4) give two alternative ways of pricing the call and,
although they are fundamentally identical and will always produce the same answer,
they are given different labels. The method delivered by equation (10.3) is often called
pricing by replication as one explicitly works out the composition of the replicating
portfolio and then solves for the price of the call.
The method delivered by equation (10.4) is called risk neutral pricing. To see where
this name comes from, note that as we required \(d < (1 + r) < u\), it must be the case
that q is between zero and one. Thus we can interpret q as if it were a probability.
Indeed, q is referred to as the risk-neutral probability of an up-move (and \(1 - q\) is
the risk-neutral probability of a down-move). This is because, in a world where all
investors were risk-neutral, q would have to be the probability of an up-move and
\(1 - q\) the probability of a down-move.
Now, examine the structure of equation (10.4). If we think of q as if it were the
probability associated with an up-move then we can interpret the formula as saying
that the price of the call is equal to the expected payoff of the call under the
risk-neutral probability structure (the term in square brackets), discounted back to the
current time at the risk-free rate (the factor \(1/(1 + r)\)). To re-iterate;
Risk-neutral pricing: any derivative can be priced by calculating its
expected payoff under the risk-neutral probability structure and discounting
this expected payoff back to the present at the risk-free rate.
Note again, that the true probabilities of up and down-moves are entirely irrelevant.
The only probabilities that enter are the risk-neutral probabilities and these are not
real probabilities.
10.9.3 Example; binomial call pricing
Assume that the risk-free rate is 5%. A stock has current price $20. In one period
its price will either rise to $25 or fall to $16. Price a one-period call option on this
stock with exercise price $22.
Well, first we must work out the size of the up and down moves. We have;
\[ u = \frac{25}{20} = 1.25, \qquad d = \frac{16}{20} = 0.8 \]
Next we need to know the call option payoffs in the two states;
\[ c_u = \max[0, uS - X] = \max[0, 3] = 3 \]
\[ c_d = \max[0, dS - X] = \max[0, -6] = 0 \]
Now, to price the option by replication we compute the values for \(\Delta\) and N;
\[ \Delta = \frac{c_u - c_d}{(u - d) S} = \frac{3}{0.45 \times 20} = \frac{1}{3}, \qquad N = \frac{d c_u - u c_d}{u - d} = \frac{0.8 \times 3 - 1.25 \times 0}{0.45} = \frac{16}{3} \]
Finally, the price of the call is;
\[ c = \Delta S - \frac{N}{1 + r} = \frac{20}{3} - \frac{16/3}{1.05} = 1.5873 \]
We can also work out the option price via the risk-neutral method. The risk-neutral
probability of an up-move is given by;
\[ q = \frac{(1 + r) - d}{u - d} = \frac{1.05 - 0.8}{1.25 - 0.8} = \frac{5}{9} \]
Then the call price is;
\[ c = \frac{1}{(1 + r)} [q c_u + (1 - q) c_d] = \frac{1}{1.05} \left[ 3 \times \frac{5}{9} + 0 \times \frac{4}{9} \right] = 1.5873 \]
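The worked example can be reproduced in Python, pricing the call both by replication (equation (10.3)) and by the risk-neutral method (equation (10.4)). This is a minimal one-period sketch; the function name and the returned dictionary layout are illustrative choices, and the put priced at the end is an extra illustration of the point that the method works for any derivative.

```python
def one_period_binomial(s, u, d, r, payoff):
    """Price a one-period derivative by replication and by risk-neutral valuation."""
    c_u, c_d = payoff(u * s), payoff(d * s)
    delta = (c_u - c_d) / ((u - d) * s)          # units of stock held
    n_bonds = (d * c_u - u * c_d) / (u - d)      # zero-coupon bonds sold
    price_replication = delta * s - n_bonds / (1.0 + r)
    q = ((1.0 + r) - d) / (u - d)                # risk-neutral up probability
    price_risk_neutral = (q * c_u + (1.0 - q) * c_d) / (1.0 + r)
    return {"delta": delta, "N": n_bonds, "q": q,
            "replication": price_replication, "risk_neutral": price_risk_neutral}


s, r, x = 20.0, 0.05, 22.0
u, d = 25.0 / s, 16.0 / s

call = one_period_binomial(s, u, d, r, lambda s_t: max(0.0, s_t - x))
print(call)   # both prices are approx 1.5873

# The same routine prices any payoff, for example a put with the same strike.
put = one_period_binomial(s, u, d, r, lambda s_t: max(0.0, x - s_t))
print(put["risk_neutral"])
```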
10.10 Black-Scholes option pricing
The previous section relied on a discrete time description of the underlying price for
a derivative, in the sense that time was assumed to pass in steps from one period to
the next and during that period the price of the underlying could move either a fixed
amount upwards or a fixed amount downwards.
The analysis of the current section makes the alternative assumption that time passes
continuously and the price of the underlying evolves on a continuous basis. We can
view this continuous time setting as the limit of a binomial world where the number
of steps in the binomial tree tends to infinity.
Using this framework and absence of arbitrage, one can derive the famous
Black-Scholes price for a call option. However, the mathematics behind this derivation are
beyond the scope of the current course. Thus we will just present the pricing formula,
analyze its properties and look at some examples.
10.10.1 Continuous compounding and discounting
Earlier we saw that in a continuous time setting, the continuously compounded value
of A from time t to T is given by;
\[ A e^{r(T - t)} \]
where r is the interest rate. Similarly, continuously discounting an amount B back
to the current time (t) from a given point in the future (T) gives;
\[ B e^{-r(T - t)} \]
We will use both of these formulae in what follows.
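Both formulae are one-liners in Python. The sketch below uses hypothetical values (an amount of 100, a rate of 5%) and illustrative function names, and shows that discounting undoes compounding over the same interval.

```python
import math


def compound(a, r, t, big_t):
    """Continuously compounded value of a from time t to time big_t: A e^{r(T - t)}."""
    return a * math.exp(r * (big_t - t))


def discount(b, r, t, big_t):
    """Value at time t of an amount b due at time big_t: B e^{-r(T - t)}."""
    return b * math.exp(-r * (big_t - t))


future_value = compound(100.0, 0.05, 0.0, 1.0)       # approx 105.13
back_again = discount(future_value, 0.05, 0.0, 1.0)  # recovers 100 up to rounding
print(round(future_value, 2), round(back_again, 2))
```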
10.10.2 The Black-Scholes formula
The Black-Scholes price for a call option maturing at time T (so that the time to
maturity is \(T - t\)) and with strike price X is given by;
\[ c = S N(d_1) - X e^{-r(T - t)} N(d_2) \qquad (10.5) \]
where S is the current underlying price, \(\sigma\) is the volatility of the underlying price
process, r is the interest rate and;
\[ d_1 = \frac{\ln(S/X) + (r + \sigma^2/2)(T - t)}{\sigma \sqrt{T - t}}, \qquad d_2 = d_1 - \sigma \sqrt{T - t} \]
Finally, the function N(·) is the cumulative Normal distribution function.
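Equation (10.5) can be evaluated with the Python standard library alone. In the sketch below the cumulative Normal distribution is computed from the error function, the time to maturity \(T - t\) is passed as a single argument `tau`, and the inputs (S = 100, X = 100, r = 5%, volatility 20%, one year to maturity) are hypothetical values chosen for illustration.

```python
import math


def norm_cdf(x):
    """Cumulative standard Normal distribution function N(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s, x, r, sigma, tau):
    """Black-Scholes call price; tau is the time to maturity T - t in years."""
    d1 = (math.log(s / x) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - x * math.exp(-r * tau) * norm_cdf(d2)


# Hypothetical inputs for illustration.
call_price = black_scholes_call(s=100.0, x=100.0, r=0.05, sigma=0.20, tau=1.0)
print(round(call_price, 4))   # approx 10.4506
```

Raising `sigma` or `s` in this function raises the call price, while raising `x` lowers it.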
We can (loosely) interpret the Black-Scholes equation as follows. It turns out that
\(N(d_2)\) can be interpreted as the risk-neutral probability that the option will be
exercised such that the term \(X e^{-r(T - t)} N(d_2)\) can be thought of as the expected present
value of the cost of exercising the option (as it's a probability × a discount factor
× the exercise price). Similarly, the first term \(S N(d_1)\) can be interpreted as the
expected present value, in a risk-neutral world, of a variable that pays the underlying
price at maturity if the option is exercised and zero otherwise. Thus
the option price is just the expected gain from exercising less the expected cost of
exercise.
Five inputs must be supplied to the BS equation in order to give the call price. One
needs to know the strike and time to maturity of the option, the current underlying
price, the underlying volatility and the interest rate. The following table tells us how
the call price (and also the price of a BS put) varies with each of these parameters:
Table 10.1: Properties of the BS pricing equations

Increase in    Call        Put
S              Increase    Decrease
X              Decrease    Increase
$\sigma$       Increase    Increase
T              Increase    Increase
r              Increase    Decrease

Some of these effects are obvious and others more subtle. Explanations are given
below:
• Call prices increase as the stock price rises as this increases the chance that the
  call is in the money. The converse is true for puts.
• Call prices fall as the exercise price is increased as this reduces the probability
  of exercise. Again the converse holds for puts.
• Rises in both $\sigma$ and T tend to increase call and put prices as both increase the
  chances that the option ends in the money.
• Increased interest rates increase the value of calls as this reduces the present
  value of the exercise price that the holder must pay. An interest rate rise reduces
  put values as it reduces the present value of the exercise price that the holder
  receives if exercise occurs.
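
The directions in Table 10.1 can be checked numerically by raising one input at a time by a small amount and seeing which way the prices move. The sketch below is an illustration under stated assumptions: the function names are our own, the base case borrows the inputs of the example in Section 10.10.4, the put is priced with equation (10.7), and the 1% bump size is arbitrary.

```python
import math

def norm_cdf(x):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_prices(S, X, sigma, tau, r):
    """Return (call, put) Black-Scholes prices; tau is the time to maturity in years."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    pv_strike = X * math.exp(-r * tau)
    call = S * norm_cdf(d1) - pv_strike * norm_cdf(d2)
    put = pv_strike * norm_cdf(-d2) - S * norm_cdf(-d1)
    return call, put

def direction(change):
    return "Increase" if change > 0 else "Decrease"

# Assumed base case (the inputs of the worked example in Section 10.10.4).
base = {"S": 20.0, "X": 22.0, "sigma": 0.225, "tau": 1.0, "r": 0.05}
call0, put0 = bs_prices(**base)

effects = {}
for name in base:
    bumped = dict(base)
    bumped[name] = base[name] * 1.01  # raise one input by 1%, hold the others fixed
    call1, put1 = bs_prices(**bumped)
    effects[name] = (direction(call1 - call0), direction(put1 - put0))
    print(f"{name:>5}: call {effects[name][0]}, put {effects[name][1]}")
```

The check is local to the assumed base case. In particular, the effect of time to maturity on a European put can change sign when the put is deep in the money, so the table should be read as describing the typical case.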
10.10.3 Measuring volatility
Of the 5 parameters in the preceding table, all are easy to obtain aside from the
volatility parameter $\sigma$. There are two ways to obtain values for this parameter.
• Estimate volatility from the history of the underlying price. Given a time-series
  of daily prices on the underlying, first compute daily returns and then compute
  their time-series standard deviation. An annualised volatility measure for that
  underlying can then be calculated as the estimated standard deviation multiplied
  by $\sqrt{252}$ (there are approximately 252 trading days in a year).
• Retrieve the volatility parameter from the price of an already traded option
  on the same underlying. If one knows that price, the option's strike, its time
  to maturity, the current interest rate and the underlying price then the Black-Scholes
  formula can be inverted to give the volatility parameter that prices the
  option. This is called an implied volatility. Then use the implied volatility to
  price another option on the underlying.
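
Both approaches can be sketched in a few lines. The code below is a minimal illustration, not a definitive implementation, and several choices in it are our own assumptions: returns are computed as log returns (the text does not say whether simple or log returns are meant), the inversion uses bisection with an assumed bracket of 0.01% to 500% volatility, and the price series is made up.

```python
import math
import statistics

def historical_volatility(prices, periods_per_year=252):
    """Annualised standard deviation of daily log returns (log returns are an assumption)."""
    returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return statistics.stdev(returns) * math.sqrt(periods_per_year)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, X, r, sigma, tau):
    """Black-Scholes call price, equation (10.5); tau is the time to maturity in years."""
    d1 = (math.log(S / X) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - X * math.exp(-r * tau) * norm_cdf(d2)

def implied_volatility(price, S, X, r, tau, lo=1e-4, hi=5.0, tol=1e-8):
    """Invert the call formula for sigma by bisection.

    Relies on the call price increasing in sigma; lo and hi are an assumed bracket.
    """
    if not (bs_call(S, X, r, lo, tau) <= price <= bs_call(S, X, r, hi, tau)):
        raise ValueError("price lies outside the range spanned by the volatility bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, X, r, mid, tau) < price:
            lo = mid  # model price too low, so volatility must be higher
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Made-up daily closing prices, for illustration only.
prices = [20.0, 20.3, 20.1, 20.6, 20.4, 20.9, 21.0]
hist_vol = historical_volatility(prices)

# Round trip: price a call at 22.5% volatility, then recover that volatility from the price.
target = bs_call(S=20.0, X=22.0, r=0.05, sigma=0.225, tau=1.0)
iv = implied_volatility(target, S=20.0, X=22.0, r=0.05, tau=1.0)
print(round(hist_vol, 4), round(iv, 4))
```

Bisection is used here only because it is simple and cannot diverge; any one-dimensional root finder would serve the same purpose.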
10.10.4 Example; BS call pricing
Assume that the risk-free rate is 5%. A stock has current price $20 and its annualized
historical volatility is 22.5%. Price a call with exercise price $22 and one year to
maturity.
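First we should work out the values for $d_1$ and $d_2$ using the previous equations. We
get the following:

\[ d_1 = \frac{\ln(20/22) + (0.05 + 0.225^2/2)(1)}{0.225\sqrt{1}} = -0.0889 \]

\[ d_2 = d_1 - 0.225\sqrt{1} = -0.31388 \]

Then using the Excel function NORMSDIST, which returns the value of the standard
Normal CDF, we get:

\[ N(d_1) = 0.4646, \qquad N(d_2) = 0.3768 \]

Finally, plugging into equation (10.5) we get:

\[ c = S N(d_1) - X e^{-r(T-t)} N(d_2) = 20 \times 0.4646 - 22 \times e^{-0.05 \times 1} \times 0.3768 = 1.4063 \]

The final figure is computed from the unrounded values of $N(d_1)$ and $N(d_2)$; using the
four-decimal values shown gives 1.4067.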
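
The same calculation can be reproduced step by step in Python. This is only a cross-check of the example, with `NormalDist().cdf` from the standard library standing in for Excel's NORMSDIST; the variable names are our own.

```python
import math
from statistics import NormalDist

# Inputs from the example: S = 20, X = 22, r = 5%, sigma = 22.5%, one year to maturity.
S, X, r, sigma, tau = 20.0, 22.0, 0.05, 0.225, 1.0

d1 = (math.log(S / X) + (r + sigma ** 2 / 2) * tau) / (sigma * math.sqrt(tau))
d2 = d1 - sigma * math.sqrt(tau)

N = NormalDist().cdf  # standard Normal CDF
call = S * N(d1) - X * math.exp(-r * tau) * N(d2)

print(round(d1, 4), round(d2, 4))        # -0.0889 -0.3139
print(round(N(d1), 4), round(N(d2), 4))  # 0.4646 0.3768
print(round(call, 4))                    # 1.4063
```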
10.11 Arbitrage and option price relationships
Finally for this chapter we provide information on some relationships that place
restrictions on options prices, either with respect to other option prices or with respect
to the underlying. These are all based on no-arbitrage.
We look at:
• Put-call parity
• Lower bounds on call prices
• Upper bounds on call prices
10.11.1 Put-call parity
Put-call parity is an arbitrage relationship between put and call prices on the same
stock. It dictates that the following condition must hold for a put and a call on the
same underlying and with the same strike price (X) and time to maturity (T):

\[ S + p = c + Xe^{-rT} \tag{10.6} \]

In the above I have used continuous discounting but the condition is unaffected by
the use of discrete compounding/discounting. We can see why put-call parity must
hold based on the following argument. Consider two portfolios.
• Portfolio A consists of the underlying and a put option with T periods to
  maturity and struck at X.
• Portfolio B consists of a call struck at X and with time to maturity T plus cash
  to the value of $Xe^{-rT}$.
The payoff profiles of these two portfolios are given below:

Portfolio   $S_T \le X$                        $S_T > X$
A           $S_T + \max[0, X - S_T] = X$       $S_T + \max[0, X - S_T] = S_T$
B           $\max[0, S_T - X] + X = X$         $\max[0, S_T - X] + X = S_T$

Note that, regardless of the final value of $S_T$, both portfolios always give the same
payoff. Thus, no-arbitrage tells us that their prices must be identical. The price
of portfolio A is $S + p$ and that of B is $c + Xe^{-rT}$. Equating these gives us the
put-call parity condition.
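
The equality of the two payoffs can also be checked by brute force over a range of terminal prices. In the sketch below the strike of 22 and the list of terminal prices are assumed values, and the function names are our own.

```python
def portfolio_a_payoff(S_T, X):
    """Portfolio A at maturity: the underlying plus a put struck at X."""
    return S_T + max(0.0, X - S_T)

def portfolio_b_payoff(S_T, X):
    """Portfolio B at maturity: a call struck at X plus cash that has grown to X."""
    return max(0.0, S_T - X) + X

X = 22.0  # assumed strike
terminal_prices = [0.0, 10.0, 21.0, 22.0, 23.0, 30.0, 100.0]  # assumed scenarios

payoffs = [(S_T, portfolio_a_payoff(S_T, X), portfolio_b_payoff(S_T, X))
           for S_T in terminal_prices]
for S_T, a, b in payoffs:
    print(S_T, a, b)
```

In every scenario both portfolios pay the larger of $S_T$ and X, which is the content of the payoff table.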
Note that we can use put-call parity to derive the Black-Scholes price for a put option.
If we plug the Black-Scholes price for a call into equation (10.6) and re-arrange we
get:

\[ p = Xe^{-r(T-t)} N(-d_2) - S N(-d_1) \tag{10.7} \]

where $d_1$ and $d_2$ are as previously defined.
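
The re-arrangement can be written out explicitly. Starting from equation (10.6) with time to maturity T - t and substituting equation (10.5), the only extra fact needed is the symmetry of the standard Normal distribution, $1 - N(d) = N(-d)$:

```latex
\[
\begin{aligned}
p &= c + Xe^{-r(T-t)} - S \\
  &= S N(d_1) - Xe^{-r(T-t)} N(d_2) + Xe^{-r(T-t)} - S \\
  &= Xe^{-r(T-t)} \left[ 1 - N(d_2) \right] - S \left[ 1 - N(d_1) \right] \\
  &= Xe^{-r(T-t)} N(-d_2) - S N(-d_1)
\end{aligned}
\]
```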
10.11.2 Bounding a call price with the underlying
We now attempt to place some bounds on the prices that a call can take using the
price of the underlying. First, however, an obvious bound is that the price of a call
can never be negative, because the payoff of a call option is never negative. An asset
with weakly positive payoffs must always have a weakly positive price:

\[ c \ge 0 \]

Second, the payoff of the call option is never larger than the terminal stock price.
This implies that holding the stock is always at least as good as holding a call and,
accordingly, the price of the stock must be weakly greater than the price of the call.
Thus:

\[ c \le S \]

Last of all, we can derive another lower bound on the value of a call option. Consider
the following two portfolios:
• Portfolio A: a call option with time to maturity T and struck at X, and cash
  to the value of $Xe^{-rT}$.
• Portfolio B: a unit of the stock.
What are the payoffs of these two portfolios at the maturity date?

Portfolio   $S_T \le X$                      $S_T > X$
A           $\max[0, S_T - X] + X = X$       $\max[0, S_T - X] + X = S_T$
B           $S_T$                            $S_T$

Note that the payoff of portfolio A is always at least as large as that of portfolio B.
Thus, the price of portfolio A must be at least as big as that of B. This implies that:

\[ c + Xe^{-rT} \ge S \quad \Longrightarrow \quad c \ge S - Xe^{-rT} \]

Thus, the price of the call is always weakly greater than the current price of the stock
less the present value of the exercise price.
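
Taken together, the three bounds say that the call price lies between $\max[S - Xe^{-rT}, 0]$ and S. The sketch below computes these bounds and checks the call price from the example in Section 10.10.4 against them; the function name and the second, in-the-money set of inputs are assumptions for illustration.

```python
import math

def call_price_bounds(S, X, r, T):
    """No-arbitrage bounds on a call price: max(S - X e^{-rT}, 0) <= c <= S."""
    lower = max(S - X * math.exp(-r * T), 0.0)
    upper = S
    return lower, upper

# Inputs and call price taken from the example in Section 10.10.4.
lower, upper = call_price_bounds(S=20.0, X=22.0, r=0.05, T=1.0)
call_price = 1.4063
within_bounds = lower <= call_price <= upper
print(lower, upper, within_bounds)  # 0.0 20.0 True

# Assumed in-the-money case: here the lower bound is strictly positive.
lower_itm, _ = call_price_bounds(S=25.0, X=22.0, r=0.05, T=1.0)
print(round(lower_itm, 2))  # 4.07
```

For the out-of-the-money call of the example, the stock price less the present value of the strike is negative, so the binding lower bound is simply zero.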
10.12 Conclusion
This chapter has introduced the 3 main types of derivative contract and presented
some simple pricing examples for such contracts. This topic will be picked up and
expanded in the second term Derivatives course where more complex derivatives will
be introduced and the pricing analysis better developed. That course will also discuss
the management of option portfolios.
Bibliography
Barber, B., R. Lehavy, M. McNichols, and B. Trueman, 2001, Can Investors Prot
from the Prophets? Consensus Analyst Recommendations and Stock Returns,
Journal of Finance, 56, 531563.
Brock, W., J. Lakonishok, and B. LeBaron, 1992, Simple technical trading rules and
the stochastic properties of stock returns, Journal of Finance, 47, 17311764.
Campbell, J., A. Lo, and A. Mackinlay, 1996, The econometrics of nancial markets.
Princeton University Press, Princeton, NJ, USA.
Chen, N., R. Roll, and S. Ross, 1986, Economic forces and the stock markets: testing
the APT and alternative asset pricing theories, Journal of Business, 59, 383403.
deBondt, W., and R. Thaler, 1985, Does the stock market overreact?, Journal of
Finance, 40, 793805.
Fama, E., 1965, The behaviour of stock market prices, Journal of Business, 60,
420429.
, 1991, Ecient capital Markets II, Journal of Finance, 46, 15751617.
Fama, E., and K. French, 1992, The cross-section of expected stock returns, Journal
of Finance, 47, 427465.
, 1993, Common risk factors in the returns on stocks and bonds, Journal
of Financial Economics, 33, 356.
189
Jensen, M., 1978, Some anomalous evidence regarding market eciency, Journal
of Financial Economics, 6, 107147.
King, M., and A. Roell, 1988, Insider trading, Economic Policy, 6, 163193.
Levich, R., and L. Thomas, 1993, The Signicance of Technical Trading Rule Prots
in the Foreign Exchange Markets: A Bootstrap Approach, Journal of Interna-
tional Money and Finance, 56, 451474.
Malkiel, B., 1992, Ecient Markets Hypotheses, in New Palgrave Dictionary of
Money and Finance. Macmillan.
Markowitz, H., 1952, Portfolio selection, Journal of Finance, 7, 7791.
Roberts, H., 1967, Statistical versus clinical prediction of the stock market, Un-
published manuscript.
Roll, R., and S. Ross, 1980, An empirical investigation of the arbitrage pricing
theory, Journal of Finance, 35, 10731103.
Seyhun, H., 1986, Insiders prots, costs of trading and market eciency, Journal
of Financial Economics, 16, 189212.
Sweeney, R., 1988, Some new lter rule tests: methods and results, Journal of
Financial and Quantitative Analysis, 23, 285300.