
A REFRESHER COURSE [FUNDAMENTAL IN QUANTITATIVE METHOD SERIES]

2014
QUANTITATIVE
METHODS IN FINANCE
A PREPARATORY COURSE FOR CFA/FRM
MS HAFEEZ




BASIC MATH BOOKS SERIES
MY E B O O K . C O M


Prefatory Introduction
This little book has been compiled for those who need a refresher course in the basic
math and statistical concepts that are required for taking the CFA/FRM exams.

This booklet will also be helpful for undergraduate students taking courses in basic math,
financial management, or economics courses.

The booklet uses a distinct color scheme to highlight important topics and formulae.

This book is copyrighted material but is being provided freely to everyone who wishes to
download it. After this first introductory edition, I, the author, reserve the right to
change my pricing or payment policy.

All the concepts have been explained in plain language with illustrations to make the
concepts easily graspable by anyone who reads it.

We assume no prior knowledge of Statistics or high level mathematics, a simple exposure in
math up to school level is sufficient to grasp the core of the subject.

Many people, including my pupils, have helped me in compiling and writing the text. I am
greatly indebted to all of them and express my gratitude openly.

Kindly send feedback and suggestions for improving the text to my private
email address below:

andylau1955@hotmail.com

Happy reading.

Thanks

Professor Alberetto Albarak
University of Turkey
Department of Undergraduate Studies






Quantitative Methods for Finance
In this course we will learn about the quantitative techniques essential for financial analysis.
You will learn about time value of money, discounted cash flow applications, statistical
concepts, probability concepts, probability distributions, sampling and estimation, hypothesis
testing, and technical analysis.
Course Content
1. Time Value of Money
2. Discounted Cash Flow Applications
3. Statistical Concepts and Market Returns
4. Probability Concepts
5. Common Probability Distributions
6. Some Additional Material
7. Epilogue

Introduction to Time Value of Money
Let's say that you are given a choice to receive $100 today or $100 one year from now. Which choice
will you prefer? Most likely, you will want to receive $100 today. You could
purchase something with that $100 today or you could deposit it in a savings account. If you deposit it
in a savings account or any other form of investment, what you will get after one year is likely to be
more than the $100 that you started with. This means that money is more valuable today than it is
tomorrow or after one year.
Another important principle inherent in the time value of money concept is that your investments earn
compound interest. The investments that you make will not only earn interest on the original principal
but will also earn interest on the interest that has been accumulated over the period.
Your objective in this reading should be to be able to solve the time value problems as quickly as
possible using the financial calculator prescribed by CFA Institute. You may be asked to calculate the
present value or future value of the cash flows arising from different types of investments.
Use of Calculator
For the CFA exam, there are only two calculators allowed by the CFA Institute. The candidates can
buy either of these and must carry them to the Exam Centre on exam day. The two calculators are:
Texas Instruments BA II Plus Financial Calculator (including BA II Plus Professional)
HP 12C Financial Calculator (including the HP 12C Platinum, 12C Platinum 25th
anniversary edition, 12C 30th anniversary edition, and HP 12C Prestige)
In this reading we will demonstrate the use of BA II Plus calculator for solving time value of money
problems.







Interest Rates
Interest rates are how we measure the time value of money. While making an investment, an investor
will need to know the interest rate that the investment will earn. The interest rates can be interpreted
in many ways.
Required Rate of Return
Required rate of return is the minimum return that an investor demands for a specific asset based on
its riskiness. This is the minimum interest rate at which investors will be willing to invest or lenders
will be willing to lend their money.
Opportunity Cost
The required rate of return also reflects the opportunity cost of forgoing the next best investment.
Opportunity cost is what a person sacrifices when he chooses one option over the other. Say you
decided to spend the money (current consumption). If investing that money instead of consuming it
earned you an interest rate of 6%, then 6% is the opportunity cost.
Discount Rate
The interest rates are also referred to as the discount rates and are used to calculate the present value
of future cash flows. If you are expecting to receive $1,000 after one year, you will use the discount
rate to calculate the present dollar equivalent of that future payment. Generally, a single discount rate
is used for all future period cash flows.
When calculating the intrinsic value of a stock, the investor will apply a discount rate that is based on
the risk-free rate of return plus some equity risk premium.

Interest Rate Equation
The required interest rate that an investor earns from an investment is made up of various
components. The general interest rate equation is expressed below:

\[ \text{Required interest rate} = \text{Nominal risk-free rate} + \text{Risk premium} \]

The nominal risk-free rate itself is expressed as the sum of real-risk free rate and inflation premium.
It is important to understand the difference between the nominal and real risk-free interest rates.
Nominal interest rates are what we observe everyday as published by banks and other financial
institutions. For example, when a T-bill pays 6% interest that is the nominal risk-free interest rate. It
already includes the premium for expected inflation. Real interest rates, on the other hand, take
purchasing power into consideration. The real rate tells you how much more you will be
able to buy with your grown investment after one year.
So, the minimum that you expect from a risk-free investment is the nominal risk-free rate. Apart from
that you will also expect a risk premium for the various types of additional risk that you take by
investing in a particular asset or security.



These risks come in many forms and a premium for each should be added to the risk-free rate to arrive
at the required rate of return from an investment. The important types of risk to be taken into
consideration are:
Default Risk: The risk that the borrower will be unable to meet its payment obligations or
keep up with the terms of payments.
Liquidity Risk: The risk that the investment is less liquid and that the investor may have to
sell the bond at a price lower than the expected price.
Maturity Risk: The risk that a bond's price becomes more volatile the longer its maturity.
A premium for each of these risks is added to the nominal risk-free rate (such as T-bill rate) to arrive
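at the required rate of return. We can rewrite the interest rate equation as follows:

\[ \text{Required interest rate} = \text{Nominal risk-free rate} + \text{Default risk premium} + \text{Liquidity premium} + \text{Maturity premium} \]
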
Effective Annual Yield
When you go to a bank inquiring about the deposit rates, the rates specified by the bank can be
expressed in two ways: nominal interest rate and the effective annual rate (also called effective annual
yield).
The difference between the two is that the nominal rate does not take compounding into
consideration, while the effective annual yield does.
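Consider an investment of $100 at a nominal rate of 10% compounded monthly.
The future value of the investment after one year will be:

\[ FV = 100 \times \left(1 + \frac{0.10}{12}\right)^{12} = 110.47 \]

The effective yield will be the absolute increase as a percentage of the principal invested.
Therefore, the effective annual yield will be:

\[ \text{EAY} = \frac{110.47 - 100}{100} = 10.47\% \]

Since the effective yield considers the compounding effect, it is greater than the nominal rate
whenever interest is compounded more than once a year.
The effective yield can be calculated using the following formula, where r is the nominal annual
rate and m is the number of compounding periods per year:

\[ \text{EAY} = \left(1 + \frac{r}{m}\right)^{m} - 1 \]
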



Why Calculate Effective Annual Yield?
Effective yield is useful when you are considering various investment options where the interest rates
are expressed at different compounding rates. In such a situation, you can convert all the rates into
effective annual yields and then make an informed decision.
For example, assume that you have a choice between investing in Bond A offering a nominal interest
rate of 5% compounded semi-annually, and another Bond B offering a nominal interest rate of 4.9%
compounded monthly. Since the compounding periods are different, a direct comparison is difficult.
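Therefore, we will calculate the effective annual yields for both bonds:

\[ \text{EAY}_A = \left(1 + \frac{0.05}{2}\right)^{2} - 1 = 5.0625\% \]
\[ \text{EAY}_B = \left(1 + \frac{0.049}{12}\right)^{12} - 1 \approx 5.0116\% \]

We can see that the effective yield for Bond A is higher, so that is the better investment, even
though Bond B compounds more frequently.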
The nominal interest is also known as Annual Percentage Rate (APR). The Effective Annual Yield is
also known as Annual Percentage Yield (APY).
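
The Bond A and Bond B comparison above can be checked with a few lines of Python. This is only a minimal sketch: the function name `effective_annual_yield` and its parameter names are illustrative choices, not something defined in this booklet.

```python
def effective_annual_yield(nominal_rate: float, periods_per_year: int) -> float:
    """Convert a nominal annual rate (APR) into an effective annual yield (APY)."""
    return (1 + nominal_rate / periods_per_year) ** periods_per_year - 1


# Bond A: 5% nominal, compounded semi-annually.
bond_a = effective_annual_yield(0.05, 2)
# Bond B: 4.9% nominal, compounded monthly.
bond_b = effective_annual_yield(0.049, 12)

# Compare the two bonds on the same (effective annual) basis.
better = "Bond A" if bond_a > bond_b else "Bond B"
print(f"Bond A: {bond_a:.4%}  Bond B: {bond_b:.4%}  higher yield: {better}")
```

Converting every quoted rate to the same effective annual basis before comparing is the whole point of the exercise; the compounding frequency alone does not tell you which bond pays more.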
Time Value of Money for Different Compounding Frequencies
Let's first review the time value of money concept using a very simple example.
Example 1
Let's say you have $2,000 to invest. You decide to invest it for 3 years in an account that pays you
interest of 6% per annum. How much will your investment grow to in 3 years?
We are calculating the future value of an investment after 3 years. This will be calculated as follows:

\[ FV = 2{,}000 \times (1 + 0.06)^{3} = 2{,}382.03 \]

Example 2
Your target is to have $10,000 saved in your account in 5 years. How much money should you invest
now to reach your target in 5 years when your investment account earns you 8% per annum?
We are calculating the present value of a future cash flow. This will be calculated as follows:

\[ PV = \frac{10{,}000}{(1 + 0.08)^{5}} = 6{,}805.83 \]

This means that if you invest $6,805.83 now for 5 years at an interest rate of 8% per annum, you will
have $10,000 at the end of 5 years.
A common assumption in both the above problems was that the frequency of compounding was
annual, that is, the interest is compounded only annually. However, this is not always the case. The
frequency of compounding could be anything, most commonly monthly, quarterly, semi-annually,
or annually. Let's look at how our future value and present value will change if we use a
different frequency of compounding.
Example 1 (With Quarterly Compounding)
In our first example, if the compounding frequency was quarterly, then how much will our investment
grow to?
Step 1: Calculate the quarterly rate
Quarterly rate = 6%/4 = 1.5%
Step 2: Calculate number of compounding periods
Compounding periods = 3 years * 4 = 12 periods
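Step 3: Calculate Future Value

\[ FV = 2{,}000 \times (1 + 0.015)^{12} = 2{,}391.24 \]

As you can see, the future value based on quarterly compounding is more than the future value based on
annual compounding.
Note that we could also calculate the effective annual yield first and then calculate the future value:

\[ \text{EAY} = (1 + 0.015)^{4} - 1 = 6.1364\% \]
\[ FV = 2{,}000 \times (1 + 0.061364)^{3} = 2{,}391.24 \]

Note that both methods produce the same result.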
Example 2 (With Monthly Compounding)
In our second example, if the compounding frequency was monthly, how much should we invest now
to reach our target of $10,000 in 5 years with an annual interest rate of 8%?
The monthly rate is 8%/12 = 0.667% and the number of compounding periods is 5*12 = 60.

\[ PV = \frac{10{,}000}{\left(1 + \frac{0.08}{12}\right)^{60}} = 6{,}712.10 \]

As you can see, with monthly compounding we need to invest less to reach our target.
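
The two examples above, with annual and with more frequent compounding, can be reproduced with a short Python sketch. The function names `future_value` and `present_value` and the `periods_per_year` parameter are assumptions made for illustration; they mirror the periodic rate and number of periods used in the steps above.

```python
def future_value(pv: float, annual_rate: float, years: float, periods_per_year: int = 1) -> float:
    """Grow a single cash flow at the periodic rate for periods_per_year * years periods."""
    periodic_rate = annual_rate / periods_per_year
    n_periods = periods_per_year * years
    return pv * (1 + periodic_rate) ** n_periods


def present_value(fv: float, annual_rate: float, years: float, periods_per_year: int = 1) -> float:
    """Discount a single future cash flow back to today."""
    periodic_rate = annual_rate / periods_per_year
    n_periods = periods_per_year * years
    return fv / (1 + periodic_rate) ** n_periods


# Example 1: $2,000 for 3 years at 6%, annual vs. quarterly compounding.
fv_annual = future_value(2000, 0.06, 3)
fv_quarterly = future_value(2000, 0.06, 3, periods_per_year=4)

# Example 2: target of $10,000 in 5 years at 8%, annual vs. monthly compounding.
pv_annual = present_value(10000, 0.08, 5)
pv_monthly = present_value(10000, 0.08, 5, periods_per_year=12)

print(round(fv_annual, 2), round(fv_quarterly, 2))
print(round(pv_annual, 2), round(pv_monthly, 2))
```

More frequent compounding raises the future value of a deposit and lowers the amount you need to set aside today for a given target, which is the pattern both examples show.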
Future Value of a Single Cash Flow
Future value of a single cash flow refers to how much a single cash flow today would grow to over a
period of time if put in an investment that pays compound interest.

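The formula for calculating future value is:

\[ FV = PV \times \left(1 + \frac{r}{m}\right)^{mT} \]

where PV is the present value, r is the annual interest rate, m is the number of compounding
periods per year, and T is the number of years.
Example
Calculate the future value (FV) of an investment of $500 for a period of 3 years that pays an interest
rate of 6% compounded semi-annually.

\[ FV = 500 \times \left(1 + \frac{0.06}{2}\right)^{2 \times 3} = 597.03 \]
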
We can also solve this problem using the calculator as shown below:
Calculator Usage: Calculating Future Value of an Investment
Calculator Variables The BA II Plus calculator has the following five variables for Time
Value of Money (TVM) functions.
N = Number of Periods (mT in our formula)
I/Y = Interest Rate Per Year (r)
PV = Present Value
FV = Future Value
PMT = Payment
The calculations are simple; you input the values that you know, and calculate the unknown.
To assign a value to a TVM variable, key in the number and press a TVM key N, I/Y, PV,
PMT, and FV.
To change the number of payments per year (P/Y), press 2nd, P/Y, key in the number and press ENTER.
To calculate the unknown value, press Compute (CPT) and then press the key for unknown
variable.
In our above example, enter PV = 500, change P/Y = 2 (semi-annual compounding), I/Y = 6, N = 6.
Then press CPT > FV. We get:
FV = 597.026
Similarly we can calculate the Future Value for any compounding frequency.

Present (Discounted) Value of a Single Cash Flow
Present value of a single cash flow refers to how much a single cash flow in the future will be worth
today. The present value is calculated by discounting the future cash flow for the given time period at
a specified discount rate.
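The formula for calculating present value is:

\[ PV = \frac{FV}{\left(1 + \frac{r}{m}\right)^{mT}} \]

where FV is the future cash flow, r is the annual discount rate, m is the number of compounding
periods per year, and T is the number of years.
Example
Calculate the present value (PV) of a payment of $500 to be received after 3 years assuming a
discount rate of 6% compounded semi-annually.

\[ PV = \frac{500}{\left(1 + \frac{0.06}{2}\right)^{2 \times 3}} = 418.74 \]
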
Calculator Usage: Calculating Present Value of a Cash Flow
We can also solve this problem using the calculator as follows:
In our above example, enter FV = 500, change P/Y = 2 (semi-annual compounding), I/Y = 6, N
= 6.


Then press CPT > PV. We get:
PV = 418.74

PRESENT VALUE AND FUTURE VALUE OF ORDINARY ANNUITY
An annuity refers to a series of equal cash flows that occur periodically such as monthly, quarterly or
annually. For example, an investment that gives you fixed monthly payments for a specified period.
There are two types of annuities, namely, ordinary annuities and annuities due. In an ordinary annuity,
the first cash flow occurs at the end of the first period, and in an annuity due, the first cash flow
occurs at the beginning (at time 0).
The present value and future value of these annuities can be calculated using a simple formula or
using the calculator.
Future Value of an Ordinary Annuity
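Let's say we have an ordinary annuity that pays $500 at the end of every year for the next 5 years.
The expected rate of return is 8%. The future value of this annuity can be represented as follows:

Year:         0      1      2      3      4      5
Cash flow:           500    500    500    500    500

This can be calculated using the following formula:

\[ FV = PMT \times \frac{\left(1 + \frac{r}{m}\right)^{mT} - 1}{r/m} = 500 \times \frac{(1.08)^{5} - 1}{0.08} = 2{,}933.30 \]
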


While you can use the above formula to calculate the future value of annuity, you can also calculate
the future value using the BAII Plus calculator. Note that in our example, m = 1, since the
compounding frequency is 1.

Calculator Usage
Enter PMT = $500, N = 5, I/Y = 8%. Since compounding frequency is 1, set Number of
Compounding Periods (C/Y) to 1 by pressing [2nd][P/Y][Down Arrow].
Since it's an ordinary annuity, we should set End-of-period payments [END]. This can be set
by pressing the keys [2nd][BGN].
To compute the future value, press the key CPT > FV
FV = $2,933.30

Present Value of an Ordinary Annuity
Calculate the present value of an ordinary annuity that pays $500 at the end of each year for the next 5
years. The discount rate is 8%.

Calculator Usage
This can be calculated using the TVM functions of BAII Plus calculator as follows:
PMT = 500
N = 5
I/Y = 8%
To compute present value, press the key CPT > PV.
PV = 1996.355
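Without the calculator, you would calculate this as follows:

\[ PV = PMT \times \frac{1 - \left(1 + \frac{r}{m}\right)^{-mT}}{r/m} = 500 \times \frac{1 - (1.08)^{-5}}{0.08} = 1{,}996.36 \]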
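
The two ordinary-annuity results above (future value and present value of $500 per year for 5 years at 8%) can be cross-checked in Python using the standard closed-form annuity formulas. This is a sketch under the assumption of equal end-of-period payments and a constant periodic rate; the function names are illustrative only.

```python
def annuity_future_value(payment: float, rate: float, n_periods: int) -> float:
    """Future value of an ordinary annuity (payments at the END of each period)."""
    return payment * ((1 + rate) ** n_periods - 1) / rate


def annuity_present_value(payment: float, rate: float, n_periods: int) -> float:
    """Present value of an ordinary annuity (payments at the END of each period)."""
    return payment * (1 - (1 + rate) ** -n_periods) / rate


fv_ordinary = annuity_future_value(500, 0.08, 5)
pv_ordinary = annuity_present_value(500, 0.08, 5)

# Cross-check: discounting the future value 5 periods should give the present value.
pv_from_fv = fv_ordinary / (1 + 0.08) ** 5

print(round(fv_ordinary, 3), round(pv_ordinary, 3))
```

Here `rate` is the rate per payment period, so for non-annual payments you would pass the annual rate divided by the number of payments per year.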


PRESENT VALUE AND FUTURE VALUE OF ANNUITY DUE
In an annuity due, the first cash flow occurs at the beginning (at time 0). We can use our BA II Plus
calculator to calculate the present value and future value of the annuity due using the same procedure
as above, just by making one minor adjustment.
By default the payment period in the calculator is set to END (End-of-period payments). However, for
annuity due, the payment occurs at the beginning of the period. So, we need to change the payment
period to the beginning BGN (Beginning-of-period payments). To make this change, follow the
following steps:
On your calculator, press: [2ND][BGN][2ND][SET]
The mode is now changed to BGN and you will see BGN displayed on the upper right corner of the
display.
To switch back to END mode, repeat the above steps. Once the mode is changed, BGN will disappear
from the screen.
We will use the same examples as we used for ordinary annuity and calculate the PV and FV of the
annuity due.

FUTURE VALUE OF AN ANNUITY DUE
An annuity due pays $500 every year for the next 5 years. The expected rate of return is 8%. The
future value of this annuity can be calculated as follows:
Calculator Usage
Since it's an annuity due, we should set payment period to beginning-of-period payments
[BGN].
Enter PMT = $500, N = 5, I/Y = 8%.
Since compounding frequency is 1, set Number of Compounding Periods (C/Y) to 1 by
pressing [2nd][P/Y][Down Arrow].
To compute the future value, press the key CPT > FV
FV = $3167.965

PRESENT VALUE OF AN ANNUITY DUE
Calculate the present value of an annuity due that pays $500 at the beginning of each year for the
next 5 years. The discount rate is 8%.
Calculator Usage
With the calculator still in BGN mode, this can be calculated using the TVM functions of BA II Plus
calculator as follows:
PMT = 500
N = 5
I/Y = 8%
To compute present value, press the key CPT > PV.
PV = $2,156.063
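
An annuity due is an ordinary annuity with every payment moved one period earlier, so each value is the ordinary-annuity value multiplied by (1 + r). The Python sketch below uses that relationship to reproduce the two annuity-due figures above; the function names are illustrative assumptions, and the closed-form ordinary-annuity formulas are the standard ones.

```python
def annuity_due_future_value(payment: float, rate: float, n_periods: int) -> float:
    """Future value when payments fall at the BEGINNING of each period."""
    ordinary = payment * ((1 + rate) ** n_periods - 1) / rate
    return ordinary * (1 + rate)  # every payment earns one extra period of interest


def annuity_due_present_value(payment: float, rate: float, n_periods: int) -> float:
    """Present value when payments fall at the BEGINNING of each period."""
    ordinary = payment * (1 - (1 + rate) ** -n_periods) / rate
    return ordinary * (1 + rate)  # every payment is discounted one period less


fv_due = annuity_due_future_value(500, 0.08, 5)
pv_due = annuity_due_present_value(500, 0.08, 5)

print(round(fv_due, 3), round(pv_due, 3))
```

This is the arithmetic that switching the calculator from END to BGN mode performs for you.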

PRESENT VALUE OF A PERPETUITY
A perpetuity is a type of annuity that pays equal cash flows periodically, such as monthly,
quarterly or annually, for an infinite period of time.

The present value of a perpetuity is calculated using the following formula, where PMT is the
periodic payment and r is the rate of return per period:

\[ PV = \frac{PMT}{r} \]

Assume that a perpetuity pays $500 per year. The rate of return is 8%. The present value of this
perpetuity is calculated as follows:

\[ PV = \frac{500}{0.08} = 6{,}250 \]

If the investor invests $6,250 in the perpetuity paying 8% rate of return, he will receive a payment of
$500 per year for an infinite period.
PRESENT VALUE AND FUTURE VALUE OF UNEVEN CASH FLOWS
We have looked at the PV/FV calculations for single sums of money and for annuities in which all the
cash flows are equal. However, there may be an investment where the cash flows are not equal.

We will now look at how to calculate the PV and FV of such an uneven series of cash flows.
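Look at the following cash flows, each occurring at the end of the year shown:

Year 1      -$500
Year 2      -$600
Year 3      +$1,000
Year 4      +$1,500
Year 5      +$2,000
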
Assuming an interest rate of 8%, we will now calculate the present value and future value of this
uneven series of cash flows.
Calculator Usage: Future value
To calculate the future value of this series of cash flows, we will need to treat each cash flow
as an independent cash flow and calculate its future value. We will adopt the procedure that we
used to calculate the future value of a single cash flow.
The following calculations are demonstrated using BA II Plus calculator.
FV1: PV = -500, N = 4, I/Y = 8. CPT > FV = -$680.244
FV2: PV = -600, N = 3, I/Y = 8. CPT > FV = -$755.827
FV3: PV = +1,000, N = 2, I/Y = 8. CPT > FV = +$1,166.400
FV4: PV = +1,500, N = 1, I/Y = 8. CPT > FV = +$1,620
FV5: PV = +2,000, N = 0, I/Y = 8. CPT > FV = +$2,000
Future Value of cash flows = Sum of all Future Values = $3350.328

Calculator Usage: Present value
To calculate the present value of this series of cash flows, we will need to treat each cash flow as
an independent cash flow and calculate its present value. We will adopt the procedure that we
used to calculate the present value of a single cash flow.
PV1: FV = -500, N = 1, I/Y = 8. CPT > PV = -$462.963
PV2: FV = -600, N = 2, I/Y = 8. CPT > PV = -$514.403
PV3: FV = +1,000, N = 3, I/Y = 8. CPT > PV = +$793.832
PV4: FV = +1,500, N = 4, I/Y = 8. CPT > PV = +$1,102.545
PV5: FV = +2,000, N = 5, I/Y = 8. CPT > PV = +$1,361.166
Present Value of cash flows = Sum of all Present Values = $2280.177

The present value of the uneven series of cash flows can also be calculated using the Cash Flow (CF)
key and NPV function.
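
The cash-flow-by-cash-flow procedure above maps directly onto a short loop. The following Python sketch takes the five cash flows from the calculator steps (at the end of years 1 to 5) and the 8% rate; the variable names are illustrative assumptions.

```python
# Cash flows at the end of years 1..5, taken from the calculator steps above.
cash_flows = [-500, -600, 1000, 1500, 2000]
rate = 0.08
n = len(cash_flows)

# Present value: discount the year-t cash flow back t periods.
pv_total = sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

# Future value at the end of year 5: compound the year-t cash flow for (n - t) periods.
fv_total = sum(cf * (1 + rate) ** (n - t) for t, cf in enumerate(cash_flows, start=1))

print(round(pv_total, 3), round(fv_total, 3))
```

As a consistency check, the future value equals the present value compounded for five years, so the two totals are two views of the same stream.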
ANNUITIES WITH DIFFERENT COMPOUNDING FREQUENCIES
In all the above examples for annuities, we assumed that the compounding frequency is annual.
However, this may not always be the case and an annuity may have monthly, quarterly, or even semi-
annual compounding. We can solve the time value of money problems for any of these compounding
frequencies using the BA II Plus calculator.
Let's take an example. An ordinary annuity pays $250 every quarter for the next 5 years. The
expected rate of return is 8% per annum. Calculate the present value of this annuity.
We can solve this problem using two methods.
Method 1:
Since payments are made quarterly, change the number of payments per year to 4. Press [2ND][P/Y].
Input 4 and then press ENTER.
Now enter the following values for variables.
N=20 (20 quarters in 5 years)
I/Y = 8 (Interest rate per annum)
PMT = $250
Then compute the Present Value [CPT][PV].
PV = $4,087.86
Method 2
Keep the number of payments per year (P/Y) as 1 and enter the periodic (quarterly) rate instead.
Enter the following variables:
N=20 (20 quarters in 5 years)
I/Y = 2% (Annual interest rate /4)
PMT = $250
Then compute the Present Value [CPT][PV].
PV = $4,087.86
Both the methods are accurate and produce the same result. Try both methods and follow the one you
are more comfortable with. One problem in the first method is that you will have to change the
compounding frequency (P/Y) and then reset it after you have solved the problem.
Using a Timeline to Solve Time Value of Money Problems
When solving a time value of money problem, it often helps to draw a timeline and place the
cash flows on it. Once we have the timeline, we can easily understand the variables and visualize the
present value or future value calculations.
In the previous pages, we demonstrated the timeline for an ordinary annuity and for uneven cash
flows.
Let's take one more example to demonstrate the use of a timeline.
Example: Loan Payments
You have taken a loan of $10,000 at an annual interest rate of 12% for a period of 2 years. Calculate
the monthly payments you will make on this loan.
The payments (PMT) or Equated Monthly Installments will be paid monthly for the next 24 months.
The above problem can be demonstrated on a timeline as follows:

Month:        0           1       2       ...     24
Cash flow:    +10,000     -PMT    -PMT    ...     -PMT

Calculator Usage
To calculate the monthly payment:
Set payments per year (P/Y) to 12
PV = 10,000
I/Y = 12
N = 24
Press CPT > PMT. We get:
PMT = 470.735
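
The loan payment can also be derived by solving the ordinary-annuity present value formula for the payment. The Python sketch below does that and then walks through the 24 months to confirm that the balance is paid off. The function name `loan_payment` and the month-by-month loop are illustrative assumptions, not part of the calculator procedure.

```python
def loan_payment(principal: float, annual_rate: float, years: int, periods_per_year: int = 12) -> float:
    """Level payment that fully repays `principal` (ordinary annuity solved for PMT)."""
    r = annual_rate / periods_per_year      # periodic interest rate
    n = years * periods_per_year            # total number of payments
    return principal * r / (1 - (1 + r) ** -n)


payment = loan_payment(10_000, 0.12, 2)

# Amortize month by month: add interest on the balance, then subtract the payment.
balance = 10_000.0
for _ in range(24):
    balance = balance * (1 + 0.12 / 12) - payment

print(round(payment, 3))
```

After the last payment the remaining balance should be zero up to floating-point rounding, which is a useful sanity check on any payment figure.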



CHAPTER 2
TABLE OF CONTENTS
Discounted Cash Flow Applications
Lesson Topics
Net Present Value
Internal Rate of Return
Conflict Between NPV and IRR (And Problem with IRR)
Holding Period Return (Total Return)
Time-weighted Returns
Money-weighted Returns
How to Calculate Annualized Returns
Yield Measures for Money Market Instruments
Bank Discount Yield
Holding Period Yield (HPY)
Effective Annual Yield for Money Market Instruments
Money Market Yield
Convert One Yield to Another




NET PRESENT VALUE
The net present value is the most commonly used method to decide whether to invest in a
project or not.
The net present value of a project is equal to the sum of the present value of all after-tax cash
flows from the project minus the initial investment.

The investment decision using the NPV method will be based on whether the NPV is positive
or negative. The NPV will be positive if the present value of all future cash flows is higher
than the initial investment. A positive NPV indicates that the project is worth investing in. On
the other hand, a negative NPV indicates that investing in the project will not be wise.

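The formula for calculating NPV is:

\[ NPV = \sum_{t=0}^{N} \frac{CF_t}{(1 + r)^{t}} \]

where CF_t is the after-tax cash flow at time t (CF_0 is the initial investment, entered as a
negative number), r is the required rate of return, and N is the number of periods.
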
Example
A project requires an initial investment of $100 million and after that provides the following
cash flows in the next four years.
Year 1: $30 million
Year 2: $30 million
Year 3: $30 million
Year 4: $50 million
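Assuming a required rate of return of 10%, the NPV of this project will be calculated as
follows:

\[ NPV = -100 + \frac{30}{1.10} + \frac{30}{1.10^{2}} + \frac{30}{1.10^{3}} + \frac{50}{1.10^{4}} = 8.756 \]

Since the project has a positive NPV ($8.756 million), the project is considered worth investing in.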
Calculator Usage
We can solve the NPV problems using the official BA II Plus calculator as follows:
Step 1: Enter Cash Flows
Before you enter the new cash flows, clear the previous work by pressing the keys CF,
2ND, CLR WORK.
Now you are ready to enter the cash flows. You should see CF0 on screen. Enter the
cash flows as follows:
Initial investment      100 [+/-] [ENTER]       CF0 = -100
Period 1 Cash Flow      [↓] 30 [ENTER]          CF1 = 30
Period 2 Cash Flow      [↓][↓] 30 [ENTER]       CF2 = 30
Period 3 Cash Flow      [↓][↓] 30 [ENTER]       CF3 = 30
Period 4 Cash Flow      [↓][↓] 50 [ENTER]       CF4 = 50
Step 2: Enter Interest Rate
Enter interest rate     [NPV] 10 [ENTER]        I = 10%
Step 3: Compute NPV
Compute NPV             [↓] [CPT]               NPV = 8.756
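
The same NPV can be computed in a few lines of Python by discounting each cash flow and summing. This is a minimal sketch: the function name `npv` and the convention that the list starts with the time-0 cash flow are assumptions chosen to match the CF0, CF1, ... layout used on the calculator.

```python
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value; cash_flows[0] occurs today, cash_flows[t] at the end of period t."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


# Initial investment of 100 (outflow), then 30, 30, 30, 50 in years 1-4 ($ millions).
project = [-100, 30, 30, 30, 50]
project_npv = npv(0.10, project)

# Decision rule from the text: accept when NPV is positive.
accept = project_npv > 0

print(round(project_npv, 3), accept)
```

Note that this convention differs from the NPV function in some spreadsheet programs, which may treat the first value as occurring one period from now, so check the convention before comparing numbers.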

INTERNAL RATE OF RETURN
The internal rate of return (IRR) is the discount rate that equates the present value of cash
inflows with the present value of cash outflows. Within the context of a net present value
analysis, when the cash inflows and outflows are known, IRR is the rate that causes the NPV to
equal zero.
The formula for IRR is as follows:

\[ \sum_{t=0}^{N} \frac{CF_t}{(1 + IRR)^{t}} = 0 \]

In our example, the IRR will be calculated as follows:

\[ -100 + \frac{30}{(1 + IRR)} + \frac{30}{(1 + IRR)^{2}} + \frac{30}{(1 + IRR)^{3}} + \frac{50}{(1 + IRR)^{4}} = 0 \]

Solving the above equation, we get IRR = 13.663%.

This equation is solved by trial and error. In Excel, you can use Goal Seek to solve for IRR.

Calculator Usage
We can solve the IRR problems using the official BA II Plus calculator as follows:
Step 1: Enter Cash Flows
Before you enter the new cash flows, clear the previous work by pressing the keys CF,
2ND, CLR WORK. Now you are ready to enter the cash flows. You should see CF0 on
screen. Enter the cash flows as follows:
Initial investment      100 [+/-] [ENTER]       CF0 = -100
Period 1 Cash Flow      [↓] 30 [ENTER]          CF1 = 30
Period 2 Cash Flow      [↓][↓] 30 [ENTER]       CF2 = 30
Period 3 Cash Flow      [↓][↓] 30 [ENTER]       CF3 = 30
Period 4 Cash Flow      [↓][↓] 50 [ENTER]       CF4 = 50
Step 2: Compute IRR
Compute IRR             [IRR] [CPT]             IRR = 13.663
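
The trial-and-error search for IRR can be automated. The Python sketch below uses simple bisection: it repeatedly halves an interval of candidate rates until the NPV at the midpoint is essentially zero. The function names, the search interval of -99% to 100%, and the tolerance are all assumptions made for illustration, and the method assumes the NPV changes sign exactly once on that interval (true for a conventional project with one outflow followed by inflows).

```python
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value with cash_flows[0] at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows: list[float], low: float = -0.99, high: float = 1.0, tol: float = 1e-12) -> float:
    """Find the rate where NPV = 0 by bisection on [low, high]."""
    f_low, f_high = npv(low, cash_flows), npv(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign on the search interval")
    for _ in range(200):
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid                 # root lies in the lower half
        else:
            low, f_low = mid, f_mid    # root lies in the upper half
        if high - low < tol:
            break
    return (low + high) / 2


project = [-100, 30, 30, 30, 50]
project_irr = irr(project)
print(f"{project_irr:.3%}")
```

Bisection is slow compared with the solvers built into calculators and spreadsheets, but it is easy to follow and cannot overshoot, which makes it a reasonable teaching example.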

CONFLICT BETWEEN NPV AND IRR (AND PROBLEM WITH IRR)
When you are analyzing a single conventional project, both NPV and IRR will provide you
the same indicator about whether to accept the project or not. However, when comparing two
projects, the NPV and IRR may provide conflicting results. It may be so that one project has
higher NPV while the other has a higher IRR. This difference could occur because of the
different cash flow patterns in the two projects.
The following example illustrates this point.

            Project A      Project B
Year 0      -5000          -5000
Year 1      2000           0
Year 2      2000           0
Year 3      2000           0
Year 4      2000           0
Year 5      2000           15000
NPV         $2,581.57      $4,313.82
IRR         29%            25%

The above example assumes a discount rate of 10%. As you can see, Project A has higher
IRR, while Project B has higher NPV.
If these two projects were independent, it wouldn't matter much because the firm could accept
both projects. However, in the case of mutually exclusive projects, the firm needs to choose
one of the two projects to invest in.
When facing such a situation, the project with the higher NPV should be chosen. The reason
lies in the reinvestment assumption built into each measure: the calculation assumes that the
interim cash flows are reinvested at the same rate at which they are discounted. In the
NPV calculation, the implicit reinvestment rate is 10%. In IRR, the implicit
reinvestment rate is 29% or 25%, which is quite unrealistic compared to 10%.
This makes the NPV results superior to the IRR results.
In this example, project B should be chosen.

The above example illustrated the conflicting results of NPV and IRR due to differing cash
flow patterns. Conflicting results can also occur because of differences in the size of the
investment. A small project may have a lower NPV but a higher IRR.


            Project A      Project B
Year 0      -5000          -20000
Year 1      2000           7000
Year 2      2000           7000
Year 3      2000           7000
Year 4      2000           7000
Year 5      2000           7000
NPV         $2,581.57      $6,535.51
IRR         29%            22%

In this case, Project A has lower NPV compared to Project B but a higher IRR. Again, if
these were mutually exclusive projects, we should choose the one with higher NPV, that is,
project B.
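
Both comparison tables can be reproduced with a short Python sketch that computes NPV at 10% and IRR for each project. The helper functions `npv` and `irr` (a plain bisection search) and the project labels are illustrative assumptions; the cash flows are taken from the two tables above.

```python
def npv(rate: float, cash_flows: list[float]) -> float:
    """Net present value with cash_flows[0] at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows: list[float], low: float = -0.99, high: float = 1.0, tol: float = 1e-12) -> float:
    """Bisection search for the rate where NPV = 0 (assumes a single sign change)."""
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tol:
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return (low + high) / 2


projects = {
    "A (both tables)": [-5000, 2000, 2000, 2000, 2000, 2000],
    "B (timing example)": [-5000, 0, 0, 0, 0, 15000],
    "B (size example)": [-20000, 7000, 7000, 7000, 7000, 7000],
}

results = {name: (npv(0.10, cfs), irr(cfs)) for name, cfs in projects.items()}
for name, (value, rate) in results.items():
    print(f"{name:20s} NPV = {value:10.2f}   IRR = {rate:.1%}")
```

In both comparisons Project A shows the higher IRR while Project B shows the higher NPV, which is the ranking conflict the text describes.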
HOLDING PERIOD RETURN (TOTAL RETURN)
For investments, the Holding Period Return (HPR) refers to the total return earned from an
investment or an investment portfolio over the holding period, that is, the period for which
the asset or portfolio was held by the investor. The holding period can be anything such as 1
day, 1 month, 6 months, 1 year, 5 years and so on.

If you buy an asset now at $100 and sell it at $120 after 2 years, the holding period return will
be (120 - 100)/100 = 20%. The time when the asset was bought can be labeled t and the
time when the asset is sold can be labeled t+1. If the asset pays any income, such
as dividend income, then that should also be added to the total return. If P
represents the price of the asset and D the income received, then the holding period return
formula can be presented as follows:

\[ HPR = \frac{P_{t+1} - P_t + D_{t+1}}{P_t} \]

Let's take a simple example to understand the HPR calculation.
Let's say that we purchased one share of a stock for $100 at the beginning of the year. After
three months, the stock price has gone up to $102 and it also pays a dividend of $2. The
holding period return will be:

\[ HPR = \frac{102 - 100 + 2}{100} = 4\% \]

The holding period returns can be annualized from either longer periods or shorter periods.
If the original HPR is calculated over multiple years (n years), then the annualized return can be
calculated as follows:

\[ \text{Annualized return} = (1 + HPR)^{1/n} - 1 \]

If the original HPRs that we have are quarterly, then we can annualize them using the
following formula:

\[ \text{Annual return} = (1 + HPR_{Q1})(1 + HPR_{Q2})(1 + HPR_{Q3})(1 + HPR_{Q4}) - 1 \]

The same compounding approach can also be used if we have the annual returns and want to
calculate the holding period return over multiple years.
For example, let's say that our investment had price changes of 10%, 8%, and -6% over
a three-year period.
The HPR can be calculated as follows:

\[ HPR = (1 + 0.10)(1 + 0.08)(1 - 0.06) - 1 = 11.67\% \]
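
The holding period return calculations in this section can be sketched in Python as follows. The function names are illustrative assumptions. `compound_returns` chains sub-period returns into one holding period return, and `annualize` converts a multi-year HPR into a yearly rate by taking the n-th root of the growth factor.

```python
def holding_period_return(begin_price: float, end_price: float, income: float = 0.0) -> float:
    """Price change plus income, as a fraction of the beginning price."""
    return (end_price - begin_price + income) / begin_price


def compound_returns(returns: list[float]) -> float:
    """Chain sub-period returns into a single holding period return."""
    growth = 1.0
    for r in returns:
        growth *= 1 + r
    return growth - 1


def annualize(hpr: float, years: float) -> float:
    """Convert a holding period return earned over `years` into an annual rate."""
    return (1 + hpr) ** (1 / years) - 1


stock_hpr = holding_period_return(100, 102, income=2)   # share bought at 100, now 102, dividend 2
three_year_hpr = compound_returns([0.10, 0.08, -0.06])  # yearly price changes from the example
two_year_annual = annualize(0.20, 2)                    # 20% earned over 2 years

print(f"{stock_hpr:.2%} {three_year_hpr:.3%} {two_year_annual:.3%}")
```

Note that the annualized figure for the 20% two-year return is below 10% per year, because compounding means two years at 10% would produce 21%, not 20%.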


TIME-WEIGHTED RETURNS


While calculating the returns on financial assets, we will often look at the returns from
multiple holding periods. For example, one may hold an asset for five years, and the asset
may have earned total 150% returns over this period of 5 years. However, it is difficult to
interpret these returns as we cannot compare them with returns on other assets. For purpose
of comparison, we will have to aggregate these returns for the same period such as daily
returns, monthly returns, or yearly returns. It will be more suitable to calculate average
annual returns than to know the returns earned over 5 years.


While calculating the aggregate returns, our return measure will vary depending on what
method we use to calculate the aggregate returns. Two common methods are arithmetic
returns and geometric returns.
Let's take an example to understand both these methods. Let's say that our portfolio
generated the following returns in 5 years.
Arithmetic Returns
To calculate the arithmetic average, we take the simple average of the 5 yearly returns
R_1, ..., R_5 as follows:

\[ \bar{R}_{A} = \frac{R_1 + R_2 + R_3 + R_4 + R_5}{5} \]

Geometric Returns
One problem with the arithmetic mean is that it implicitly assumes the same beginning
investment amount in every period. It ignores the compounding effect of the returns earned in
previous years, so the measure can be seriously flawed. Consider an
investment of $100 at the beginning. Say in the first year the investment value rises to $200. The
return is 100%.
In year 2, the investment falls back to $100, which is a return of -50% in year 2. If
we take the average of the two yearly returns, i.e., 100% in year 1 and -50% in year 2, it shows an
average annual return of 25% on this investment, even though our investment value is back to
$100 (where we started). This problem can be solved by calculating geometric returns,
which incorporate the compounding effect.

In our example the geometric returns can be calculated as follows, where R_1, ..., R_5 are the
yearly returns:

\[ \bar{R}_{G} = \left[(1 + R_1)(1 + R_2)(1 + R_3)(1 + R_4)(1 + R_5)\right]^{1/5} - 1 \]


As you can see, geometric return is lower than the arithmetic return, and is a better method
for aggregating returns over multiple holding periods.
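
The difference between the two averages is easy to see on the $100 to $200 to $100 example from the text. The Python sketch below computes both; the function names are illustrative assumptions, and `math.prod` requires Python 3.8 or later.

```python
import math


def arithmetic_mean_return(returns: list[float]) -> float:
    """Simple average of the period returns."""
    return sum(returns) / len(returns)


def geometric_mean_return(returns: list[float]) -> float:
    """Constant per-period return that produces the same ending value."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# +100% in year 1, then -50% in year 2: the investment ends where it started.
returns = [1.00, -0.50]
arith = arithmetic_mean_return(returns)
geo = geometric_mean_return(returns)

print(f"arithmetic: {arith:.1%}, geometric: {geo:.1%}")
```

The arithmetic average reports 25% per year even though nothing was gained, while the geometric average reports 0%, matching the actual growth of the investment.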
MONEY-WEIGHTED RETURNS
We learned about arithmetic returns and geometric returns. However, the problem with these
measures is that they do not consider the amount invested in each period. For
example, in the first year we may have an investment of $5,000, while in the second year
the investment may be only $2,000. Taking into account how much money was invested in each
period can make a huge difference to our actual return on investment. The measure that does
this is called the money-weighted return, or internal rate of return.
Let's say we had the following investments and returns in the past 3 years:
In the first year, we made an investment of $1,000, and we had a 100% return in the first year.
By the end of the year, our investment had grown to $2,000. Then at the beginning of the
second year we invested $2,000 more, bringing the invested value to $4,000. The return in the
second year was -50%, and our investment value fell to $2,000. Then assume we
withdrew $500 from the investment fund, leaving only $1,500 invested. In the third year there
was no new investment, and our return was 35%, making our investment grow to $2,025.
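The cash flows are shown in the table below.
Time (year)   Cash flow   Description
0             -$1,000     Initial investment
1             -$2,000     Additional investment
2             +$500       Withdrawal
3             +$2,025     Ending value of the investment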

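The money-weighted return can be calculated using the same formula as the internal
rate of return (IRR): it is the rate at which the present values of all cash flows sum to zero.
$$\sum_{t=0}^{T} \frac{CF_t}{(1 + IRR)^t} = 0$$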
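Our cash flows are as follows:
CF0 = -$1,000
CF1 = -$2,000
CF2 = +$500
CF3 = +$2,025
Setting the present value of these cash flows to zero and solving for IRR we get:
$$-1000 - \frac{2000}{1 + IRR} + \frac{500}{(1 + IRR)^2} + \frac{2025}{(1 + IRR)^3} = 0 \quad\Rightarrow\quad IRR \approx -7.8\%$$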

This tells the investor what she actually earned per year on the money invested over the entire
three-year period. Note that this return is negative because a much larger amount of
money was invested in the year of negative returns than in the other years.
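The IRR equation has no closed-form solution in general, so it is usually solved numerically. The sketch below uses simple bisection on the net present value; the solver, its bracket of -99% to +100%, and the function names are assumptions made for illustration, not the method used in the original text.

```python
def npv(rate, cash_flows):
    """Net present value of cash flows occurring at t = 0, 1, 2, ..."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def irr_bisection(cash_flows, low=-0.99, high=1.0, tol=1e-10):
    """Find the rate where NPV = 0 by bisection.

    Assumes the NPV changes sign exactly once between `low` and `high`.
    """
    f_low = npv(low, cash_flows)
    if f_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the given bracket")
    while high - low > tol:
        mid = (low + high) / 2
        f_mid = npv(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid              # root lies in [low, mid]
        else:
            low, f_low = mid, f_mid  # root lies in [mid, high]
    return (low + high) / 2


# Cash flows from the example: invest 1,000 and 2,000, withdraw 500, end with 2,025.
cash_flows = [-1000, -2000, 500, 2025]
money_weighted_return = irr_bisection(cash_flows)
print(f"Money-weighted return: {money_weighted_return:.2%}")
```

A spreadsheet IRR function or a financial calculator should give the same rate for these four cash flows.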

HOW TO CALCULATE ANNUALIZED RETURNS
When we make investments, we invest our money in different assets and earn returns over
different periods of time. For example, an investment in a short-term Treasury bill may be for
3 months. We may invest in a stock and exit after a week or just a few days. To make
the returns on these different investments comparable, we need to annualize the
returns. So, all daily, weekly, monthly, or quarterly returns are converted to annualized
returns. The process for annualizing the returns is as follows:
The basic idea is to compound the returns to an annual period. So, if we have monthly
returns, we know that there are 12 months in the year; similarly, there are 52 weeks, 4
quarters, and 365 days. We compound our returns by the number of periods in the whole
year.
$$\text{Annualized return} = (1 + \text{periodic return})^{\text{periods per year}} - 1$$
Let's take a few examples to understand this.
Example 1: Quarterly Returns
Let's say we have 5% quarterly returns. Since there are four quarters in a year, the annual
returns will be:
$$\text{Annual returns} = (1 + 0.05)^{4} - 1 = 21.55\%$$
Example 2: Monthly Returns
Let's say we have 2% monthly returns. Since there are 12 months in a year, the annual
returns will be:
$$\text{Annual returns} = (1 + 0.02)^{12} - 1 = 26.8\%$$
Example 3: Weekly Returns
Let's say we have 0.5% weekly returns. Since there are 52 weeks in a year, the annual returns
will be:
$$\text{Annual returns} = (1 + 0.005)^{52} - 1 = 29.6\%$$
Example 4: Daily Returns
Let's say we have 0.1% daily returns. Since there are 365 days in a year, the annual returns
will be:
$$\text{Annual returns} = (1 + 0.001)^{365} - 1 = 44.03\%$$
Example 5: 100 Days Returns
We can actually have returns for any number of days and convert them to annualized returns.
Let's say we have 6% returns over 100 days. The annual returns will be:
$$\text{Annual returns} = (1 + 0.06)^{365/100} - 1 = 23.70\%$$
Annualized returns, however, have one limitation: they assume that we will be able to
reinvest the money at the same rate. This may not always be possible. If we earned
5% in a quarter, there is no guarantee that we will be able to replicate these returns over the
next three quarters in the year.
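The five examples can be reproduced with one small helper. This is a minimal Python sketch; the function name `annualize` and the dictionary of examples are illustrative choices.

```python
def annualize(period_return, periods_per_year):
    """Compound a periodic return over the number of periods in a year."""
    return (1 + period_return) ** periods_per_year - 1


examples = {
    "5% quarterly": annualize(0.05, 4),
    "2% monthly": annualize(0.02, 12),
    "0.5% weekly": annualize(0.005, 52),
    "0.1% daily": annualize(0.001, 365),
    "6% over 100 days": annualize(0.06, 365 / 100),
}

for label, annual in examples.items():
    print(f"{label:>18}: {annual:.2%}")
```

Note that `periods_per_year` does not need to be a whole number, which is what makes the 100-day case work.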

YIELD MEASURES FOR MONEY MARKET INSTRUMENTS
In addition to issuing long-term bonds, governments also issue short-term instruments such as
Treasury bills (T-bills) with maturities of up to one year. T-bills do not carry a coupon, but
are sold on a discount basis.
For example, the US Treasury, UK government and the French government are all active and
regular issuers of bills. These represent the highest quality money market instruments
available from a credit standpoint, and are used by researchers to measure short term risk-free
rates.
In the UK, T-bills are issued by the Bank of England on behalf of the government, normally
on a weekly basis and normally by tender. Treasury bills can be issued for any term up to one
year, but the tendency has been to issue for 3- or 6-month periods. At the tender, the
prospective purchaser has to indicate the price he is prepared to pay. This price is a function
of the interest rate expected.
We will learn about the following yield measures:
Bank Discount Yield
Holding Period Yield (HPY)
Effective Annual Yield
Money Market Yield

Bank Discount Yield
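T-bills are quoted on a bank discount yield basis. The bank discount yield is calculated using
the following formula:
$$y_D = \frac{F - P}{F} \times \frac{360}{t}$$
where F is the face value, P is the purchase price, and t is the number of days to maturity.
Let's take an example. The quoted price for a 90-day T-bill is USD 975,342 with a face value
of USD 1 million.
The bank discount yield will be calculated as follows:
$$y_D = \frac{1{,}000{,}000 - 975{,}342}{1{,}000{,}000} \times \frac{360}{90} = 9.86\%$$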
The quoted yield on a bank discount basis is not a meaningful measure of the investor's
return for the following reasons:
It is based on the face value rather than the actual dollar amount invested.
It is annualized according to a 360-day rather than a 365-day year. This makes it
difficult to compare T-bills with Treasury notes and bonds, which pay interest on a
365-day basis.
While annualizing the yield, it ignores the compounding effect.
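The bank discount yield calculation for the 90-day T-bill can be sketched in Python as follows. The function and parameter names are illustrative; the inputs are the price and face value from the example.

```python
def bank_discount_yield(face_value, price, days_to_maturity):
    """Discount as a fraction of face value, annualized on a 360-day year."""
    discount = face_value - price
    return discount / face_value * 360 / days_to_maturity


bdy = bank_discount_yield(face_value=1_000_000, price=975_342, days_to_maturity=90)
print(f"Bank discount yield: {bdy:.2%}")
```

Because the discount is divided by the face value rather than the price paid, this figure understates what the investor earns on the money actually invested.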

Holding Period Yield (HPY)
For investments, the Holding Period Yield (HPY) or Holding Period Return (HPR) refers to
the total return earned from an investment or an investment portfolio over the holding period,
that is, the period for which the asset or portfolio was held by the investor.
$$HPY = \frac{P_{t+1} - P_t + \text{Income}}{P_t}$$
where $P_t$ is the price paid when the asset was purchased and $P_{t+1}$ is the price received
at the time of sale of the asset. Income is any income earned from the asset, such as interest.
In our T-bill example, the holding period return will be:
$$HPY = \frac{1{,}000{,}000 - 975{,}342}{975{,}342} = 2.53\%$$
Note that for a T-bill there is no interest income since it is sold at a discount.
EFFECTIVE ANNUAL YIELD FOR MONEY MARKET INSTRUMENTS
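For a money market instrument such as a T-bill, the Effective Annual Yield (EAY) is the
Holding Period Return annualized with compounding on a 365-day basis. It is calculated using
the following formula, where t is the number of days in the holding period:
$$EAY = (1 + HPY)^{365/t} - 1$$
In our T-bill example, the HPR was 2.53%. With a holding period of 90 days, we can
calculate the effective annual yield as follows:
$$EAY = (1 + 0.0253)^{365/90} - 1 = 10.66\%$$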

Money Market Yield
The money market yield (MMY), also known as the CD-equivalent yield, allows the quoted yield
on a T-bill to be compared with an interest-bearing money market instrument. The money market
yield uses a 360-day year for calculation.
$$MMY = \frac{360 \times y_D}{360 - t \times y_D}$$
where $y_D$ is the bank discount yield and t is the number of days to maturity.
In our T-bill example, we can calculate MMY as follows:
$$MMY = \frac{360 \times 9.86\%}{360 - 90 \times 9.86\%} = 10.11\%$$
We can also calculate MMY from HPY, as MMY is just the HPY annualized on a 360-day
year basis without compounding.
HPY = 2.528% (shown as 2.53% when rounded to two decimals)
Days to maturity = 90
MMY = 2.528% * 360/90 = 10.11%
CONVERT ONE YIELD TO ANOTHER
If we have any one of HPY, EAY, or MMY, we can convert it to the other two.
Continuing with our previous example, let's say the money market yield is 10.11% and the
holding period is 90 days. This is the annualized yield from the asset on a 360-day year basis,
but it does not account for compounding. It uses simple interest.
First, let's calculate the holding period yield. This is the actual return earned by the
investor. Since the investor held the asset for only 90 days, the HPY will be calculated as:
HPY = MMY*t/360 = 10.11%*90/360 = 2.53%

The effective annual yield (EAY) is the annualized yield on a 365-day basis that also
incorporates compounding. We can use HPY to calculate EAY as follows:
$$EAY = (1.0253)^{365/90} - 1 = 10.66\%$$
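The relationships between the yield measures can be collected in a few helper functions. This is a minimal Python sketch using the 90-day T-bill from the running example; all function names are illustrative, and the last assertion-style comparison simply confirms that the two routes to the money market yield agree.

```python
def holding_period_yield(purchase_price, sale_price, income=0.0):
    """Total return over the holding period, not annualized."""
    return (sale_price - purchase_price + income) / purchase_price


def effective_annual_yield(hpy, days):
    """HPY compounded to a 365-day year."""
    return (1 + hpy) ** (365 / days) - 1


def money_market_yield(hpy, days):
    """HPY annualized with simple interest on a 360-day year."""
    return hpy * 360 / days


def money_market_yield_from_discount(bank_discount_yield, days):
    """Convert a bank discount yield to a money market yield."""
    return 360 * bank_discount_yield / (360 - days * bank_discount_yield)


def hpy_from_money_market_yield(mmy, days):
    """Undo the 360-day simple annualization."""
    return mmy * days / 360


days = 90
hpy = holding_period_yield(975_342, 1_000_000)
eay = effective_annual_yield(hpy, days)
mmy = money_market_yield(hpy, days)
bdy = (1_000_000 - 975_342) / 1_000_000 * 360 / days
mmy_alt = money_market_yield_from_discount(bdy, days)

print(f"HPY: {hpy:.2%}  EAY: {eay:.2%}  MMY: {mmy:.2%}  MMY from discount: {mmy_alt:.2%}")
```

Working from unrounded inputs avoids the small differences that appear when a rounded 2.53% or 9.86% is fed back into the formulas.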
Bond Equivalent Yield
In the bond market the convention is to annualize the semi-annual yield by simply doubling
it. So, if the semi-annual yield is 3%, the annual yield is calculated simply as 3% x 2 = 6%.
The annual yield so calculated is called the bond-equivalent yield (BEY).
This convention doesn't follow the time value of money rules, under which you would compound
the semi-annual yield to calculate the effective annual yield. Instead, the doubling convention
is followed across the market.
A common question asked by students is, "Why have such a convention, and why not instead
use the effective annual yield?" The answer is that since it's a convention,
everybody uses it, and therefore yields are comparable. It doesn't really affect performance or
comparison between bonds because everyone uses the same convention to quote the
yield. Conventions are usually made to make things simpler. In this case, if someone tells you
that the bond-equivalent yield is 6%, you instantly know that the semi-annual yield is 3%, which
you can use to perform any calculations or to calculate the effective annual yield if you
require it.
Using the effective annual yield as the convention might have been better, but does it really
matter? In fact, there are many other limitations of yield to maturity (YTM) that far outweigh
the problem of the BEY convention. So, my suggestion to you would be to just follow the
convention and not fret over it. It is important, however, that you use the convention correctly.
Note: To calculate the bond equivalent yield, we first need the semi-annual yield. For a semi-
annual coupon paying bond, we calculate this directly and double it to calculate the bond
equivalent yield.
However, for an annual coupon paying bond or for any asset with a shorter maturity, we first
convert the yield that we have into a semi-annual yield and then double it to calculate BEY.
For an annual-pay bond, BEY will be calculated as follows:
$$BEY = 2 \times \left[(1 + \text{yield on annual-pay bond})^{0.5} - 1\right]$$
For an instrument with a 3-month yield, BEY will be calculated as follows:
$$BEY = 2 \times \left[(1 + \text{3-month yield})^{2} - 1\right]$$
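The three BEY cases can be sketched in Python as follows. The function names are illustrative, and the 6.09% annual yield and 1.5% quarterly yield are hypothetical inputs chosen for the demonstration; only the 3% semi-annual yield comes from the text.

```python
def bey_from_semiannual(semiannual_yield):
    """Bond-equivalent yield: simply double the semi-annual yield."""
    return 2 * semiannual_yield


def bey_from_annual(annual_yield):
    """Convert an annual-pay yield to a semi-annual yield, then double it."""
    return 2 * ((1 + annual_yield) ** 0.5 - 1)


def bey_from_quarterly(quarterly_yield):
    """Compound a 3-month yield over two quarters, then double it."""
    return 2 * ((1 + quarterly_yield) ** 2 - 1)


print(f"Semi-annual 3%:  BEY = {bey_from_semiannual(0.03):.2%}")
print(f"Annual 6.09%:    BEY = {bey_from_annual(0.0609):.2%}")    # hypothetical input
print(f"Quarterly 1.5%:  BEY = {bey_from_quarterly(0.015):.3%}")  # hypothetical input
```

An annual-pay yield of 6.09% is the effective annual equivalent of 3% per half-year, so its BEY comes back to 6%.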

CHAPTER 3
TABLE OF CONTENTS




Statistical Concepts and Market Returns
Lesson Topics
Descriptive Vs. Inferential Statistics
Types of Measurement Scales
Relative Frequencies and Cumulative Relative Frequencies
Properties of a Data Set (Histogram / Frequency Polygon)
Measures of Central Tendency
Calculating Arithmetic Mean
Calculating Weighted Average Mean
Calculating Geometric Mean
Calculating Harmonic Mean
Calculating Median and Mode of a Data Set
Quartiles, Quintiles, Deciles, and Percentiles
Range and Mean Absolute Deviation
Variance and Standard Deviation
Chebyshev's Inequality
Coefficient of Variation
Sharpe Ratio

Descriptive Vs. Inferential Statistics
Statistics is the science of analyzing data. When you are presented with the daily closing
prices of a stock for the past one year, how do you make sense of this data? Using the tools
and techniques offered by statistics, you can analyze the data in various ways. For example,
you can find out the average price of the stock over the past one year. You can also calculate
other statistics such as the dispersion of the stock prices around the mean. Statistics deals
with all aspects of data, including collecting data, organizing it, analyzing it, interpreting it,
and presenting it in a useful form.
All statistical methods can be classified as descriptive or inferential statistics.




Descriptive statistics refers to analysis of data in order to summarize the important
characteristics of data in a meaningful way. However, descriptive statistics does not allow us
to make any conclusions beyond the data. Two important types of descriptive statistics
include the Measures of Central Tendency and Measures of Dispersion.
For example, you may have the monthly savings data of 100 families, and using that you can
calculate descriptive statistics such as the average savings and the dispersion of savings in this
group of 100 families. However, descriptive statistics will describe the characteristics of only
this group of 100 families. The group of data that contains all the data that you are interested
in describing is called the population. Another example of a population is the returns of all
stocks trading on NASDAQ. Note that the size of the population does not matter. As long as the
data set, whether small or big, contains all the data that you are interested in, it represents
your population.
Inferential statistics uses sample data to reach conclusions about the characteristics
of the larger population. Using the same example of savings by families, we know that
descriptive statistics cannot be used to draw any conclusions about families other than the
100 families in our data group.
For example, what if you were interested in the savings pattern of an entire country, such as
the U.S.? It may not be feasible or practical to collect the monthly savings data of every family
in the U.S., which together constitute your population. In that case, you will take a small
sample of families from across the U.S. to represent the larger population. You
will use this sample data to calculate its mean and standard deviation. We then use inferential
statistics techniques to draw conclusions or inferences about the population that the sample
represents. Two common methods of inferential statistics are Estimation of Parameters and
Hypothesis Testing.
Types of Measurement Scales
Depending on the information we want the data to represent, we can choose one of the four
measurement scales.
Nominal Scale
Used to classify data
Observations are put into categories based on some criteria.
The category labels can be numbers, but they don't have any numeric value.
Example 1: Classifying stocks as small-cap, mid-cap, and large-cap
Example 2: Classifying funds as equity funds, debt funds, and balanced funds.




Ordinal Scale
Used to classify and order (Ranking)
Observations are not just classified but also ordered
Example: Ranking top 10 stocks based on their P/E ratio
The numbers only represent the order. They do not say anything about how much
better or worse a stock is at a given number compared to one at a lower number.
Interval Scale
Used to classify and order with an equal interval scale
The intervals between adjacent scale values are equal.
Scale has an arbitrary zero point and as a result you cannot calculate ratios.
Example: Temperature scales. A temperature of 40 degrees is higher than 35 degrees
and is higher by 5 degrees.
The problem is that a temperature of 0 degrees does not imply absence of
temperature. Because of this, a temperature of 20 degrees does not necessarily mean
twice as hot as a temperature of 10 degrees.
Ratio Scale
All the above features along with an absolute zero.
Equal units of measurements and a rational zero point for the scale.
Example: Income of a group of people in dollars. If you have 0 dollars, that means a
complete absence of money (what we are measuring). Ratios are also meaningful: if A has $10
and B has $20, then B has twice as much money as A has.
Relative Frequencies and Cumulative Relative Frequencies
The data in a frequency distribution can also be presented using relative frequencies. The
relative frequency of an interval is its absolute frequency divided by the total number of
observations.
$$\text{Relative frequency} = \frac{\text{Absolute frequency}}{\text{Total number of observations}}$$
Once we have relative frequencies, we can calculate cumulative relative frequencies: as
we move from the first frequency interval to the last, we keep adding the relative frequencies,
finally reaching 100%. Cumulative relative frequencies are useful in measuring what fraction
of total observations are less than the upper limit of a frequency interval.
We will extend our example to show the relative frequencies and cumulative relative
frequencies.


Interval      Absolute Frequency   Relative Frequency   Cumulative Relative Frequency
0 <= r < 2    3                    3/20 = 15%           15%
2 <= r < 4    5                    5/20 = 25%           40%
4 <= r < 6    6                    6/20 = 30%           70%
6 <= r < 8    2                    2/20 = 10%           80%
8 <= r < 10   4                    4/20 = 20%           100%
Total         20                   100%

The cumulative relative frequency is equal to the sum of the relative frequencies of all the
previous intervals including the current interval. For example, the cumulative relative
frequency for the interval 4 <= r < 6 is 15% + 25% + 30% = 70%.
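The frequency table can be rebuilt from raw data in a few lines of Python. The sketch below assumes the underlying observations are the 20 values listed in the Measures of Central Tendency section, which reproduce the absolute frequencies shown; the variable names are illustrative.

```python
from itertools import accumulate

# The 20 observations used throughout this chapter (assumed to underlie the table).
data = [1.5, 2.5, 3, 2.3, 4.3, 5.6, 4.2, 6.7, 5.9, 1.2,
        5.4, 9.8, 8.5, 5.5, 2.9, 1.7, 8.8, 6.2, 9.5, 3.8]
edges = [0, 2, 4, 6, 8, 10]  # interval boundaries: lower <= r < upper

intervals = list(zip(edges, edges[1:]))
absolute = [sum(1 for r in data if lo <= r < hi) for lo, hi in intervals]
relative = [count / len(data) for count in absolute]
cumulative = list(accumulate(relative))  # running total of relative frequencies

for (lo, hi), count, rel, cum in zip(intervals, absolute, relative, cumulative):
    print(f"{lo} <= r < {hi}: {count:2d}  {rel:5.0%}  {cum:5.0%}")
```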
Properties of a Data Set (Histogram / Frequency Polygon)
Histogram
The data in a frequency distribution can be presented using a histogram. A histogram is a bar
chart with different intervals on the X-axis and the absolute frequencies on the Y-axis. The
histogram for our data is presented below:
[Figure: histogram of the absolute frequencies for each interval]



Frequency Polygon
A frequency polygon is similar to a histogram, except that the x-axis plots the midpoint of
each interval. Instead of bars, the neighboring points are connected by lines.
The interval midpoints for our frequency intervals are 1, 3, 5, 7, and 9. The frequency
polygon will look as follows:
[Figure: frequency polygon connecting the absolute frequencies at the interval midpoints]


Measures of Central Tendency
The measures of central tendency identify the center of a data set. These are the most
widely used of all the statistical measures. The most commonly used measures of
central tendency include the arithmetic mean, geometric mean, weighted average mean, median,
and mode.
We will use the following data set to calculate all these measures:
1.5, 2.5, 3, 2.3, 4.3, 5.6, 4.2, 6.7, 5.9, 1.2, 5.4, 9.8, 8.5, 5.5, 2.9, 1.7, 8.8, 6.2, 9.5, 3.8




Calculating Arithmetic Mean
Arithmetic mean is the simple average of all observations and is calculated by adding all the
observations and dividing the sum by the total number of observations.
We can calculate arithmetic mean for both the population and the sample.
Population Mean
This is the arithmetic mean of all observations in the population. The formula for population
mean is given below:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
Sample Mean
This is the arithmetic mean of all observations in the sample of the population. The formula
for sample mean is given below:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Notice the difference in notations between the two formulas:
$X_i$ represents the observations in both formulas.
The number of observations in the population is represented by capital N, while the number
of observations in the sample is represented by small n.
Example
Population Dataset: 1.5, 2.5, 3, 2.3, 4.3, 5.6, 4.2, 6.7, 5.9, 1.2, 5.4, 9.8, 8.5, 5.5, 2.9, 1.7, 8.8,
6.2, 9.5, 3.8
We can draw a sample from the above data set.
Sample Dataset: 2.5, 5.6, 1.2, 9.8, 8.8
$$\mu = \frac{99.3}{20} = 4.965 \qquad \bar{X} = \frac{27.9}{5} = 5.58$$



Some observations:
As you can see, there is a sizable difference between the population mean and the sample
mean. This can happen for small data sets or if the sample is not drawn correctly.
Arithmetic mean is very sensitive to extreme values, as a very large or very small value can
significantly pull the mean to either side.
For arithmetic mean, the sum of deviations from the mean is always zero, i.e.,
$\sum_{i=1}^{n} (X_i - \bar{X}) = 0$.
Arithmetic mean is often preferred over median and mode, as it uses all the information about
the observations, such as size and magnitude.
There can be only one arithmetic mean for a data set.
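These points can be verified with a short Python sketch using the population and sample data sets from the example. The helper name `arithmetic_mean` is illustrative.

```python
population = [1.5, 2.5, 3, 2.3, 4.3, 5.6, 4.2, 6.7, 5.9, 1.2,
              5.4, 9.8, 8.5, 5.5, 2.9, 1.7, 8.8, 6.2, 9.5, 3.8]
sample = [2.5, 5.6, 1.2, 9.8, 8.8]


def arithmetic_mean(values):
    """Sum of the observations divided by their count."""
    return sum(values) / len(values)


population_mean = arithmetic_mean(population)
sample_mean = arithmetic_mean(sample)

# Deviations from the mean cancel out (up to floating-point rounding).
deviation_sum = sum(x - population_mean for x in population)

print(f"Population mean: {population_mean:.3f}")
print(f"Sample mean:     {sample_mean:.3f}")
print(f"Sum of deviations from the population mean: {deviation_sum:.10f}")
```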
Calculating Weighted Average Mean
One characteristic of an arithmetic mean is that all observations have equal weight (=1/N).
However, this may not always be the case. In some cases, different observations may
influence the mean differently. This has special relevance in portfolios where a portfolio is
made up of different stocks each having a different weight.
Let's assume that we have a portfolio comprising three stocks, A, B and C, as follows:
Stock   Returns   Weight
A       12%       20%
B       18%       30%
C       24%       50%

We have the stock returns for each stock and the weight of each stock in the portfolio. For
example, if the investor has a total of $1,000 invested in the portfolio, 20% or $200 is
invested in Stock A, $300 is invested in Stock B, and the remaining $500 is invested in Stock
C.
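The weighted average mean is calculated using the following formula:
$$\bar{X}_w = \frac{\sum_{i=1}^{n} w_i X_i}{\sum_{i=1}^{n} w_i}$$
The weighted mean of our portfolio will be calculated as follows:
$$\bar{X}_w = \frac{0.20 \times 12\% + 0.30 \times 18\% + 0.50 \times 24\%}{0.20 + 0.30 + 0.50} = 19.8\%$$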
Note that the weighted mean is closer to the returns from Stock C because Stock C has more
influence (weight) on the portfolio.
[Note: usually the sum of all weights is 1 (or 100%), so the denominator can be ignored.]
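The portfolio calculation can be written as a short Python sketch. The dictionaries and variable names are illustrative; the returns and weights are those of Stocks A, B and C in the table.

```python
returns = {"A": 0.12, "B": 0.18, "C": 0.24}
weights = {"A": 0.20, "B": 0.30, "C": 0.50}

# Weighted sum of returns, divided by the sum of weights (which is 1 here).
weighted_mean = sum(weights[s] * returns[s] for s in returns) / sum(weights.values())
simple_mean = sum(returns.values()) / len(returns)

print(f"Weighted mean return: {weighted_mean:.2%}")
print(f"Simple mean return:   {simple_mean:.2%}")
```

Keeping the division by the sum of weights makes the same code work when the weights are given as dollar amounts rather than fractions.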
Calculating Geometric Mean
One problem with the arithmetic mean is that it treats each period's return as if it were
earned on the same beginning investment amount. It ignores the compounding effect of the
returns earned in previous years, so the arithmetic measure can be seriously misleading.
Consider an investment of $100 at the beginning. Say in the first year the investment value
rises to $200. The return is 100%. In year 2, the investment falls back to $100, which is a
return of -50% for year 2. If we take the average of the two yearly returns, i.e., 100% in
year 1 and -50% in year 2, it shows an average annual return of 25% on this investment, even
though our investment value is back to $100 (where we started). This problem can be solved by
calculating geometric returns, which incorporate the compounding effect.
Let's take an example to understand how geometric returns are calculated. Let's say our
portfolio generated returns of 100%, -50%, 35%, -20%, and 50% over 5 years.
$$\text{Geometric return} = \left[(1 + 100\%)(1 - 50\%)(1 + 35\%)(1 - 20\%)(1 + 50\%)\right]^{1/5} - 1 \approx 10\%$$
The arithmetic mean return on this portfolio would have been 23%.
As you can see, the geometric return is lower than the arithmetic return, and it is a more
suitable method for aggregating returns over multiple holding periods.
Calculating Harmonic Mean
Harmonic mean is calculated by dividing the number of observations (n) by the sum of the
reciprocals of all observations.
$$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}}$$
Harmonic mean has some applications in finance. One application is to calculate the average
purchase cost of shares purchased over time.
Let's say that an investor purchased $100 worth of a stock in each of two months. The share
price at the time of each purchase was $5 and $7. What will be the average purchase price? We
can calculate this as follows.
The numbers of shares purchased in the two months are $100/5 = 20 and $100/7 = 14.286.
The total number of shares purchased is 34.286 for a total cost of $200. The average purchase
price will be $200/34.286 = $5.833. This is in fact the harmonic mean.
We can use the harmonic mean formula to calculate this.
$$H = \frac{2}{\frac{1}{5} + \frac{1}{7}} = 5.833$$
For a set of positive values that are not all equal, the relationship between Harmonic Mean,
Arithmetic Mean, and Geometric Mean is as given below:
Harmonic Mean < Geometric Mean < Arithmetic Mean
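The share-purchase example and the ordering of the three means can be checked with Python's standard `statistics` module (`harmonic_mean`, `geometric_mean` and `mean`; `geometric_mean` requires Python 3.8 or later). The variable names are illustrative.

```python
import statistics

prices = [5, 7]          # share price at each purchase
amount_per_month = 100   # dollars invested at each purchase

# Direct calculation: total dollars spent divided by total shares bought.
shares_bought = sum(amount_per_month / p for p in prices)
average_cost = amount_per_month * len(prices) / shares_bought

harmonic = statistics.harmonic_mean(prices)
geometric = statistics.geometric_mean(prices)
arithmetic = statistics.mean(prices)

print(f"Average purchase cost: {average_cost:.3f}")
print(f"Harmonic mean:         {harmonic:.3f}")
print(f"Geometric mean:        {geometric:.3f}")
print(f"Arithmetic mean:       {arithmetic:.3f}")
```

The average cost equals the harmonic mean only because the same dollar amount is invested at each purchase.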

Calculating Median and Mode of a Data Set
Median
Median refers to the midpoint or the middle value of a data set after sorting the values in
ascending or descending order.
Our data set sorted in ascending order is presented below:
1.2, 1.5, 1.7, 2.3, 2.5, 2.9, 3, 3.8, 4.2, 4.3, 5.4, 5.5, 5.6, 5.9, 6.2, 6.7, 8.5, 8.8, 9.5, 9.8
Note that our data set has 20 values, so we don't have a single middle point. There are two
middle values, namely 4.3 and 5.4. The median in this case will be the arithmetic mean of
these two values.
Median = (4.3+5.4)/2 = 4.85
This will be the case for all data sets containing an even number of observations.
When we have an odd number of observations (such as 5, 7, or 9), the median is simply the
middle value. For example, in the case of 5 observations, the median will be the 3rd value.
Mode
Mode refers to the most frequently occurring value in a data set.
Let's look at our data set:
1.2, 1.5, 1.7, 2.3, 2.5, 2.9, 3, 3.8, 4.2, 4.3, 5.4, 5.5, 5.6, 5.9, 6.2, 6.7, 8.5, 8.8, 9.5, 9.8
All the values in this data set are different. This means that this data set has no mode.
Let's look at another data set:
1, 3, 4, 4, 5, 7
In this data set the value 4 occurs twice while the rest of the values occur only once. So, the
value 4 has the highest frequency of occurring. The mode of this data set is 4. Such a
distribution is called uni-modal.
If two values occur in the data set with the highest frequency, the distribution is
called bimodal. A data set having 3 values with the highest frequency is called trimodal, and
so on. For example, in the following data set the values 4 and 5 both occur with the highest
frequency.
1, 3, 4, 4, 5, 5, 7
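The median and mode examples can be reproduced with Python's standard `statistics` module. One caveat in this sketch: `statistics.multimode` returns every value tied for the highest frequency, so for a data set whose values are all different it returns all of them, which corresponds to what the text calls having no mode.

```python
import statistics

data = [1.2, 1.5, 1.7, 2.3, 2.5, 2.9, 3, 3.8, 4.2, 4.3,
        5.4, 5.5, 5.6, 5.9, 6.2, 6.7, 8.5, 8.8, 9.5, 9.8]

# With an even number of observations, median() averages the two middle values.
median_value = statistics.median(data)

unimodal = statistics.multimode([1, 3, 4, 4, 5, 7])
bimodal = statistics.multimode([1, 3, 4, 4, 5, 5, 7])
all_distinct = statistics.multimode(data)  # every value ties at a frequency of 1

print(f"Median: {median_value:.2f}")
print(f"Mode(s) of the uni-modal set: {unimodal}")
print(f"Mode(s) of the bimodal set:   {bimodal}")
print(f"Values tied for most frequent in the 20-value set: {len(all_distinct)}")
```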

Quartiles, Quintiles, Deciles, and Percentiles
A quantile refers to a value at or below which a stated fraction of the data lies. Quantile is a
general term, and we have different types of quantiles referring to different fractions.
Quartiles: divide the distribution into quarters.
Quintiles: divide the distribution into fifths.
Deciles: divide the distribution into tenths.
Percentiles: divide the distribution into hundredths.

Percentiles are the most commonly used quantiles, and other quantiles are also expressed in
terms of percentiles.
For example, the first quartile is the value at or below which a quarter, or 25 percent, of the
values in the distribution lie. The second quartile has 50 percent of values at or below it
(the 50th percentile).
Example 1
Suppose we have a data set with 20 observations and we want to calculate the third quartile,
or 75th percentile. Since the total number of observations (20) is divisible by 4, the
calculation is simple. The 15th value is the third quartile because three-fourths of the
values lie at or below the 15th value.
Example 2
The calculation of a quartile may, however, be difficult if the total number of observations is
not divisible by 4. For large data sets, this will not matter. However, for a small data set the
quartile value will only be an approximation.
Let's say our data set has 15 observations and we want to calculate the third quartile.
1.2, 1.5, 1.7, 2.3, 2.5, 2.9, 3, 3.8, 4.2, 4.3, 5.4, 5.5, 5.6, 5.9, 6.2
Since 15 is not divisible by 4, we cannot directly observe the third quartile. We will use a
slightly different method to calculate this.
We know that the third quartile is the 75th percentile.
We can calculate the position of an observation at a given percentile using the following
formula.
$$L_y = (n + 1) \times \frac{y}{100}$$
where y is the percentile we want to find and n is the number of observations in the data set.
In our data set, n = 15 and y = 75.
$$L_y = (15 + 1) \times \frac{75}{100} = 12$$
The 12th observation is 5.5. That means approximately 75 percent of observations lie at or
below 5.5.
In this example $L_y$ was a whole number. If it was not, we would have used linear
interpolation to find the quartile.
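The position formula, together with the linear interpolation step mentioned for non-integer positions, can be sketched in Python as follows. The function name and the handling of positions that fall outside the data (clamping to the first or last observation) are assumptions made for this illustration.

```python
def percentile_value(sorted_data, y):
    """Value at the y-th percentile using the position L_y = (n + 1) * y / 100.

    Positions are 1-based. A fractional position is resolved by linear
    interpolation between the two neighbouring observations.
    """
    n = len(sorted_data)
    position = (n + 1) * y / 100
    if position <= 1:
        return sorted_data[0]    # assumption: clamp to the smallest value
    if position >= n:
        return sorted_data[-1]   # assumption: clamp to the largest value
    lower = int(position)        # 1-based index of the observation below
    fraction = position - lower
    below = sorted_data[lower - 1]
    above = sorted_data[lower]
    return below + fraction * (above - below)


fifteen = [1.2, 1.5, 1.7, 2.3, 2.5, 2.9, 3, 3.8, 4.2, 4.3,
           5.4, 5.5, 5.6, 5.9, 6.2]
twenty = fifteen + [6.7, 8.5, 8.8, 9.5, 9.8]

print(percentile_value(fifteen, 75))  # position 12 is a whole number
print(percentile_value(twenty, 75))   # position 15.75 needs interpolation
```

For the 20-value data set this formula places the third quartile three-quarters of the way between the 15th and 16th observations, slightly above the 15th value used in the simpler rule of Example 1.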




Range and Mean Absolute Deviation
In investment management, one of the most important things for an investor is the trade-off
between the returns and risk from an investment. The return or reward is measured using the
measures of central tendency while the risk is measured using the measures of dispersion.
The dispersion here refers to how the observations vary around the mean.
We will now look at the different types of measures of dispersion.
Range
Range is calculated as the difference between the highest value and the lowest value.
Range = Highest Value - Lowest Value
Refer to our data set:
1.2, 1.5, 1.7, 2.3, 2.5, 2.9, 3, 3.8, 4.2, 4.3, 5.4, 5.5, 5.6, 5.9, 6.2, 6.7, 8.5, 8.8, 9.5, 9.8
The lowest value is 1.2 and the highest value is 9.8.
The range will be given as:
Range = 9.8 - 1.2 = 8.6
Mean Absolute Deviation
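Mean absolute deviation (MAD) is calculated as the average of the absolute values of the
differences between each observation and the arithmetic mean.
$$MAD = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n}$$
Note that we take the absolute value of the differences. So, any negative signs are ignored.
Let's take the following data set of five values:
1.2, 1.5, 1.7, 2.3, 2.5
The mean of these values is 9.2/5 = 1.84, so:
$$MAD = \frac{0.64 + 0.34 + 0.14 + 0.46 + 0.66}{5} = \frac{2.24}{5} = 0.448$$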
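Both measures take only a few lines of Python. This sketch uses the 20-value data set for the range and the five-value data set for the mean absolute deviation; the variable names are illustrative.

```python
data = [1.2, 1.5, 1.7, 2.3, 2.5, 2.9, 3, 3.8, 4.2, 4.3,
        5.4, 5.5, 5.6, 5.9, 6.2, 6.7, 8.5, 8.8, 9.5, 9.8]
value_range = max(data) - min(data)

five_values = [1.2, 1.5, 1.7, 2.3, 2.5]
mean = sum(five_values) / len(five_values)
# Average distance from the mean, ignoring the sign of each deviation.
mean_absolute_deviation = sum(abs(x - mean) for x in five_values) / len(five_values)

print(f"Range: {value_range:.1f}")
print(f"Mean of the five values: {mean:.2f}")
print(f"Mean absolute deviation: {mean_absolute_deviation:.3f}")
```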



Neither Range nor Mean Absolute Deviation is a very commonly used measure of
dispersion. We will now look at the more widely used measures, namely, variance and standard
deviation.
Variance and Standard Deviation
Risk is the possibility that actual returns might differ, or vary, from expected returns. In fact,
actual returns will most likely differ from expected returns. It is important for decision-
makers to estimate the magnitude and likelihood of the difference between actual and
estimated returns. After all, there is a big difference if your predictions result in an error of
only $100 versus an error of $1 million.
By using the concepts of variance and standard deviation, investors can judge not only how
wrong their estimates might be, but also estimate the likelihood, or probability, of favorable
or unfavorable outcomes. With the tools of expected return and standard deviation, financial
decision-makers are better able to evaluate alternative investments based on risk-return
tradeoffs, and their own risk preferences.
In general, the risk of an asset or a portfolio is measured in the form of the standard deviation
of the returns, where standard deviation is the square root of variance. Let's look at how
standard deviation and variance are calculated.
Let's say our observation data set is the returns data of a stock. Using this data, we calculate
the mean/average returns. The variance of the asset returns will then be the average of the
squared differences between the returns and the mean.
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$
The standard deviation will simply be the square root of the variance.
$$\sigma = \sqrt{\sigma^2}$$
The following is a simple example that illustrates the calculation:
[Table: worked example computing returns, variance, and standard deviation from stock prices]


Note that in the above example we start with the stock price. Once we have the stock price
data, the first step is to calculate the returns. What kind of returns we have depends on the
periodicity of the data. For example, if we have daily prices, then we calculate the daily
return, which is calculated as (P(t1) - P(t0)) / P(t0).

Notations
The population variance and standard deviation are written as $\sigma^2$ and $\sigma$, while
the sample variance and standard deviation are written as $s^2$ and $s$. For the sample
variance, we use n-1 in the denominator instead of N. This is done because, as per the theory,
dividing by the full sample size n tends to underestimate the population variance.
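The steps described above (prices to returns, then mean, variance and standard deviation) can be sketched in Python. The price series below is hypothetical, invented only for this illustration, and the variable names are assumptions.

```python
import math

# Hypothetical daily closing prices (not taken from the original example).
prices = [100.0, 102.0, 101.0, 105.0, 107.0]

# Step 1: periodic returns, (P(t1) - P(t0)) / P(t0).
returns = [(p1 - p0) / p0 for p0, p1 in zip(prices, prices[1:])]

# Step 2: mean return and squared deviations from it.
mean_return = sum(returns) / len(returns)
squared_deviations = [(r - mean_return) ** 2 for r in returns]

# Step 3: population variance divides by N, sample variance by n - 1.
population_variance = sum(squared_deviations) / len(returns)
sample_variance = sum(squared_deviations) / (len(returns) - 1)

population_std = math.sqrt(population_variance)
sample_std = math.sqrt(sample_variance)

print(f"Mean return:    {mean_return:.4%}")
print(f"Population std: {population_std:.4%}")
print(f"Sample std:     {sample_std:.4%}")
```

The sample figures are always slightly larger than the population figures for the same data, because the sum of squared deviations is divided by a smaller number.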
Chebyshev's Inequality
Chebyshev's Inequality is used to describe the percentage of values in a distribution within an
interval centered at the mean.
It states that for a distribution, the percentage of observations that lie within k standard
deviations of the mean is at least $1 - 1/k^2$, for any k > 1.









This is illustrated below:

Example
The following table shows the minimum percentage of observations that lie within a certain
number of standard deviations of the mean.
Standard Deviations (k)   Minimum % of observations
1.5                       56%
2                         75%
3                         89%
4                         94%

An important feature of Chebyshev's Inequality is that it works with any kind of distribution
that has a finite variance (e.g., Normal, Poisson, Harmonic, Geometric, Bernoulli, Binomial,
hyperbolic, etc.).
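The table values follow directly from the bound 1 - 1/k^2. The Python sketch below reproduces them; the function name is illustrative, and the check for k <= 1 reflects the fact that the bound says nothing useful in that range.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


for k in (1.5, 2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.0%}")
```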
Coefficient of Variation
We earlier learned about calculating the variance and standard deviation for a set of data.
Standard deviation as a measure of dispersion is much easier to interpret as it uses the same
unit of measurement as the data itself. However, standard deviation is not a good measure if
we are comparing the relative degree of variation of two sets of data or distributions. For this
purpose we have another measure called the coefficient of variation. The coefficient of
variation measures the degree of variation in a distribution relative to the mean of the
distribution.




Suppose we have two data sets A and B. The coefficient of variation for these data sets is
calculated as follows:
$$CV_A = \frac{\sigma_A}{\mu_A} \qquad CV_B = \frac{\sigma_B}{\mu_B}$$
The coefficients of variation of two distributions can be compared to find out which one has
less dispersion per unit of return.
Let's say the standard deviation of A is 2% with a mean of 6%, and the standard deviation of B
is 1.5% with a mean of 4%.
$$CV_A = \frac{2\%}{6\%} = 0.333 \qquad CV_B = \frac{1.5\%}{4\%} = 0.375$$
We can see that distribution A has less dispersion per unit of return than distribution B.
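The comparison of A and B can be expressed as a small Python sketch. The function name is illustrative; the inputs are the standard deviations and means given for the two distributions.

```python
def coefficient_of_variation(std_dev, mean):
    """Dispersion per unit of mean: standard deviation divided by the mean."""
    return std_dev / mean


cv_a = coefficient_of_variation(std_dev=0.02, mean=0.06)
cv_b = coefficient_of_variation(std_dev=0.015, mean=0.04)

print(f"CV of A: {cv_a:.3f}")
print(f"CV of B: {cv_b:.3f}")
print("A has less dispersion per unit of return" if cv_a < cv_b
      else "B has less dispersion per unit of return")
```

Although B has the smaller standard deviation in absolute terms, its mean is also smaller, which is why its coefficient of variation is the larger of the two.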
Sharpe Ratio
While deciding about what investments to make, one should weigh the rewards versus the
risks of the investment opportunity. The Sharpe ratio is one popular measure of return on
risk. It is named after Nobel Laureate professor William F. Sharpe.
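The Sharpe ratio measures the reward (or excess return) of an asset per unit of risk.
The Sharpe ratio is commonly expressed as:
$$\text{Sharpe ratio} = \frac{R_p - R_f}{\sigma_p}$$
where $R_p$ is the return on the portfolio, $R_f$ is the risk-free rate, and $\sigma_p$ is the
standard deviation of the portfolio's returns.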
Both the return and the standard deviation are annualized. For this purpose, returns are
commonly annualized by multiplying linearly by time, a simplification that ignores
compounding. For example, a monthly return of 1% converts to an annualized return of
12%. Standard deviation of return is a measure of risk, or uncertainty, of returns. To
annualize standard deviation, multiply by the square root of time.
For example, a monthly standard deviation of return of 1% converts to an annualized standard
deviation of 1% x SQRT(12) = 3.46%.
A higher Sharpe ratio indicates better portfolio performance. Sharpe ratios can be increased
either by increasing returns or by decreasing risk.
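The annualization steps above can be sketched in Python. The sketch assumes the usual definition of the Sharpe ratio (return in excess of the risk-free rate, divided by the standard deviation of returns). The 3% risk-free rate and all function names are hypothetical and used only for illustration.

```python
import math


def annualize_return(periodic_return, periods_per_year):
    """Simple (linear) annualization, as described in the text."""
    return periodic_return * periods_per_year


def annualize_std(periodic_std, periods_per_year):
    """Standard deviation scales with the square root of time."""
    return periodic_std * math.sqrt(periods_per_year)


def sharpe_ratio(annual_return, annual_risk_free, annual_std):
    """Excess return per unit of risk."""
    return (annual_return - annual_risk_free) / annual_std


# Monthly return of 1% and monthly standard deviation of 1% (from the text).
annual_return = annualize_return(0.01, 12)
annual_std = annualize_std(0.01, 12)

risk_free = 0.03  # hypothetical annual risk-free rate, not from the text
sharpe = sharpe_ratio(annual_return, risk_free, annual_std)

print(f"Annualized return: {annual_return:.2%}")
print(f"Annualized standard deviation: {annual_std:.2%}")
print(f"Sharpe ratio: {sharpe:.2f}")
```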


As we know, a portfolio can achieve higher returns by taking on additional risk. Using the
Sharpe ratio, one can determine whether higher returns come from better performance or from
additional risk.
Historically, Sharpe ratios over long periods of time for most major asset classes have ranged
from 0.3 to 2.
The Sharpe ratio has two limitations.
1. When the Sharpe ratio is positive, increasing the risk decreases the ratio. When the
Sharpe ratio is negative, however, increasing the risk brings the Sharpe ratio closer to
zero, i.e., gives a higher Sharpe ratio. This does not mean better risk-adjusted
performance.
2. It considers only the standard deviation as a measure of risk. If the portfolio returns
are asymmetric, as in strategies involving options, standard deviation is not a good
risk measure. For such a portfolio, using standard deviation will overstate the Sharpe
ratio.

CHAPTER 4
TABLE OF CONTENTS

Probability Concepts
Lesson Topics
Two Defining Properties of Probability
Probability - Basic Terminology
Empirical, Subjective and A Priori Probability
State the Probability of an Event as Odds
Unconditional and Conditional Probabilities
Multiplication, Addition and Total Probability Rules
Joint Probability of Two Events
Probability of At least One of the Events Occurring
Dependent Vs. Independent Events in Probability
Joint Probability of a Number of Independent Events
Unconditional Probability Using Total Probability Rule


Expected Value of Investments
Calculating Variance and Standard Deviation of Stock Returns
Conditional Expected Values
Calculating Covariance and Correlation
Expected Value of a Portfolio
Variance and Standard Deviation of a Portfolio
Bayes Theorem
Multiplication Rule of Counting
Permutation and Combination Formula

Two Defining Properties of Probability
There are two important properties of probability.
The probability of an event E is between 0 and 1, inclusive, i.e., 0 <= P(E) <= 1.
The sum of probabilities of all mutually exclusive and exhaustive events is equal to 1.

These two properties together define probability.
When we roll a die, the events 1, 2, 3, 4, 5, and 6 are mutually exclusive and exhaustive.
The probability of any event occurring is between 0 and 1. The sum of probabilities of all
these 6 events is equal to 1.
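As a quick illustration, the two defining properties can be checked for the die example in Python. The helper is_valid_distribution is a hypothetical name, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction

# A fair die: six mutually exclusive and exhaustive events.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def is_valid_distribution(probabilities):
    """Check both defining properties of probability."""
    values = list(probabilities.values())
    each_in_range = all(0 <= p <= 1 for p in values)  # property 1
    sums_to_one = sum(values) == 1                    # property 2
    return each_in_range and sums_to_one


print(is_valid_distribution(die))
```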

Probability - Basic Terminology
Before we learn about the probability concepts, it is important to know the basic terminology.
Random Variable
A random variable is one of the most important concepts in finance. A random variable is a
variable whose value is the outcome of a random phenomenon. For example, the result of
throwing a die is a random variable: the process is fixed but the outcome is not.
Random variables describe key quantities such as asset returns. We then use distribution
functions to characterize the random variables.





Outcome
An outcome is a possible result of a probability experiment. For example, a portfolio earning
8% returns is an outcome.
An Event
An event is a single outcome or a set of outcomes to which we assign a probability. For
example, a portfolio earning 8% returns is an event, and so is a portfolio earning
returns between 6% and 8%.
Mutually Exclusive Events
Mutually exclusive events are events that cannot happen at the same time.
For example, when we throw a die, if one event is to get number 3 and another event is to get
number 4, these events are mutually exclusive; they cannot happen together.
Exhaustive Events
Exhaustive events are those that together include all possible outcomes. This means at least
one of the events must occur. For example, when we roll a die, the events 1, 2, 3, 4, 5, and 6
are exhaustive, because they cover the entire range of possible outcomes. This
example is in fact both mutually exclusive and exhaustive: apart from being all-encompassing,
no two of the events can occur together.
Notations
We will use the following notations for probability throughout the reading.
E denotes an event
P(E) denotes the probability of an event
Empirical, Subjective and A Priori Probability
There are three types of probabilities:
Empirical Probability: Based on observed or historical data.
Subjective Probability: Based on an individual's judgment about the probability of
occurrence of an event. The probability of an event is determined by an individual, based on
that person's past experience, personal opinion, and/or analysis of a particular situation.
A Priori Probability: Based on prior knowledge. Involves deductive reasoning.

State the Probability of an Event as Odds
The probabilities can also be stated in terms of odds. For example, odds for an event or odds
against an event.
Let's say that the probability of an event occurring is P(E). If we roll a die, the probability of
getting a 5 is P(5) = 1/6.
Odds for E
The odds for Event E to occur can be stated as follows:
Odds for E = P(E) / (1 - P(E))
In our example, the odds for getting a 5 are (1/6)/(1 - 1/6) = 1/5, or one-to-five.
Odds against E
The odds against Event E occurring can be stated as follows:
Odds against E = (1 - P(E)) / P(E)
In our example, the odds against getting a 5 are (1 - 1/6)/(1/6) = 5/1, or five-to-one.
The concept of odds doesn't have much relevance in finance and investments. It is more
commonly used in betting.
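The conversion between a probability and odds can be sketched in Python, assuming the usual definitions (odds for E = P(E)/(1 - P(E)), odds against E = (1 - P(E))/P(E)). The function names are illustrative only.

```python
from fractions import Fraction


def odds_for(p):
    """Odds for an event with probability p: p / (1 - p)."""
    return p / (1 - p)


def odds_against(p):
    """Odds against an event with probability p: (1 - p) / p."""
    return (1 - p) / p


p_five = Fraction(1, 6)  # probability of rolling a 5

print(f"Odds for a 5: {odds_for(p_five)}")          # 1/5, i.e. one-to-five
print(f"Odds against a 5: {odds_against(p_five)}")  # 5, i.e. five-to-one
```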
Unconditional and Conditional Probabilities
Let's say you are asked the following question:
What is the probability of your portfolio earning a return greater than 10%?
This kind of probability is an unconditional probability, as the probability is not dependent
on the occurrence of any other event. The event, A, is that the portfolio will earn a return
greater than 10%. The probability of such an event will be specified as P(A). The calculation
is quite simple. The numerator is the sum of probabilities of all returns above 10%.
Assume this is 0.60. The denominator is 1, the sum of probabilities of all possible returns.
The probability P(A) = 0.60/1 = 0.60.
Now, let's ask another related question:
What is the probability of your portfolio earning a return greater than 10% given that the
returns are never below 5%?
Notice that we have added a new condition: given that the returns are never below 5%. Now
the probability of the portfolio earning returns greater than 10% is not unconditional. It is
conditional on another event, B, that is, the returns are never below 5%. Such a probability is
called a conditional probability, and is expressed as P(A|B), the probability of A given B.
Our calculation will now change.
The numerator is the joint probability P(A and B). Because every return above 10% is also a
return of 5% or more, this is still the sum of probabilities of all returns above 10%.
We assumed this to be 0.60.
The denominator will now consider Event B as well: it is the sum of probabilities of all returns
being 5% or more. Assume this is 0.80.
The conditional probability will be calculated as P(A|B) = P(A and B)/P(B) = 0.60/0.80 = 0.75.
[P(A|B) is read as the probability of event A given that event B has already occurred.]
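The same calculation can be written as a short Python sketch, using the general definition P(A|B) = P(A and B)/P(B). Here P(A and B) equals 0.60 because every return above 10% is also at least 5%. The function name is illustrative.

```python
def conditional_probability(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    if p_b == 0:
        raise ValueError("cannot condition on an event with zero probability")
    return p_a_and_b / p_b


p_a_and_b = 0.60  # returns above 10% (all of which are also at least 5%)
p_b = 0.80        # returns of 5% or more

p_a_given_b = conditional_probability(p_a_and_b, p_b)
print(f"P(A|B) = {p_a_given_b:.2f}")
```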

Multiplication, Addition and Total Probability Rules

Addition Rule
The addition rule determines the probability of at least one of the events occurring.
P(A or B) = P(A) + P(B) - P(A and B)
If A and B are mutually exclusive, then P(A and B) = 0, so the rule can be simplified as
follows:
P(A or B) = P(A) + P(B)
Multiplication Rule
The multiplication rule determines the joint probability of two events.
P(A and B) = P(A | B) x P(B)
The joint probability of A and B is equal to the probability of A given B multiplied by the
probability of B.
If A and B are independent, then P(A | B) = P(A), and the multiplication rule simplifies to:
P(A and B) = P(A) x P(B)
Total Probability Rule
The total probability rule determines the unconditional probability of an event in terms of
probabilities conditional on scenarios.
P(A) = P(A | S1)P(S1) + P(A | S2)P(S2) + ... + P(A | Sn)P(Sn)
where the scenarios S1, S2, ..., Sn are mutually exclusive and exhaustive.
Let's take an example to understand this.
Event A: Company X's stock price will rise.
Event B: Inflation will fall. P(B) = 0.6. Therefore, the probability of inflation not falling is
P(B^C) = 0.4
Probability of stock price rising given a fall in inflation, P(A|B) = 0.8
Probability of stock price rising given no fall in inflation, P(A|B^C) = 0.6
We can use the total probability rule to calculate the probability of a rise in stock price as
follows:
P(A) = P(A|B) x P(B) + P(A|B^C) x P(B^C) = 0.8*0.6 + 0.6*0.4 = 0.72
This is the total probability of event A occurring under all scenarios.
Joint Probability of Two Events
Joint probability refers to the multiplication rule of probability. This is the probability that
both the events will occur.
P(AB) = P(A | B) x P(B)
Example
The probability that the price of oil will rise, P(B) = 0.5
The probability that the bus fare will increase if the oil price rises, P(A|B) = 0.4
The probability that both oil prices and bus fares will rise, P(AB) = 0.4*0.5 = 0.2
This may look complex but the logic is actually quite straightforward. There is a 50% chance
that the oil price will rise, and if it rises there is a 40% chance that the bus fare will also rise. So,
the joint probability of both an oil price rise and a bus fare rise is 40% of 50%, i.e., 0.5*0.4 = 0.20
or 20%.
Probability of At Least One of the Events Occurring
This refers to the addition rule.
The addition rule determines the probability of at least one of the events occurring.
P(A or B) = P(A) + P(B) - P(A and B)
If A and B are mutually exclusive, then P(A and B) = 0, so the rule can be simplified as
follows:
P(A or B) = P(A) + P(B) for mutually exclusive events A and B.
Example
An investor is contemplating buying one of the two stocks A or B. The probability P(A) that
the investor will buy stock A is 0.30. The probability P(B) that the investor will buy stock B is
0.50. The probability that he may buy both, P(A and B), is 0.10.
The probability that the investor will buy at least one of the two stocks (A, or B, or both) is
calculated as follows:
P(A or B) = 0.30 + 0.50 - 0.10 = 0.70
Suppose instead that A and B are mutually exclusive events, that is, the investor will buy only one
of the two stocks, then:
P(A or B) = 0.30 + 0.50 = 0.80
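The stock example can be reproduced with a small Python sketch of the addition rule, P(A or B) = P(A) + P(B) - P(A and B). The function name prob_a_or_b is illustrative; passing no joint probability corresponds to mutually exclusive events.

```python
def prob_a_or_b(p_a, p_b, p_a_and_b=0.0):
    """Addition rule; p_a_and_b is 0 for mutually exclusive events."""
    return p_a + p_b - p_a_and_b


# The investor may buy both stocks with probability 0.10.
at_least_one = prob_a_or_b(0.30, 0.50, 0.10)

# The investor buys only one of the two stocks (mutually exclusive events).
exclusive_case = prob_a_or_b(0.30, 0.50)

print(f"P(A or B) = {at_least_one:.2f}")
print(f"P(A or B), mutually exclusive = {exclusive_case:.2f}")
```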
Dependent Vs. Independent Events in Probability
Two events are said to be independent if the occurrence of one event is in no way affected by
the occurrence of the other event. Suppose we roll a die and receive a 6. The second time we
roll the die, its outcome will not be affected by the fact that we received a 6 in the first roll.
The outcome of each roll is independent of each other.
To be independent, one of the following conditions must be true:
P (A | B) = P(A) or P(B | A) = P(B)






If the two events are not independent, then they are said to be dependent, that is, the
occurrence of one event influences the occurrence of another event.
Joint Probability of a Number of Independent Events
If two events are independent, then the joint probability of these two independent events is
calculated as:
P(A and B) = P(A) x P(B)
Example
Suppose we roll two dice. The joint probability of getting a 1 on first die and a 6 on the
second die is given as follows:
Probability of getting a 1 on first die, P(A) = 1/6
Probability of getting a 6 on second die, P(B) = 1/6
P(A and B) = 1/6 * 1/6 = 1/36 = 0.0278
The same rule can be applied to calculate the joint probability of any number of independent
events. For example, if there are three independent events, A, B and C, their joint probability
will be:
P(A and B and C) = P(A) x P(B) x P(C)
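One way to convince yourself of the rule for independent events is to enumerate all 36 outcomes of two dice and compare the counted joint probability with the product P(A) x P(B). The following Python sketch does this with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))
n = len(outcomes)

p_a = Fraction(sum(1 for first, _ in outcomes if first == 1), n)     # 1 on first die
p_b = Fraction(sum(1 for _, second in outcomes if second == 6), n)   # 6 on second die
p_a_and_b = Fraction(sum(1 for o in outcomes if o == (1, 6)), n)     # both together

print(f"P(A) = {p_a}, P(B) = {p_b}, P(A and B) = {p_a_and_b}")
print("Independent:", p_a_and_b == p_a * p_b)
```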

Unconditional Probability Using Total Probability Rule
As we learned earlier, the total probability rule determines the unconditional probability of an
event in terms of probabilities conditional on scenarios.
P(A) = P(A | S1)P(S1) + P(A | S2)P(S2) + ... + P(A | Sn)P(Sn)
where the scenarios S1, S2, ..., Sn are mutually exclusive and exhaustive.
Let's take one more example of the Total Probability Rule.
An analyst is assessing the performance of a stock under different scenarios. He comes up
with the following probabilities.
State of Economy   Probability of Economic State   Stock Performance   Probability
No recession       P(R^C) = 0.60                   Rise                P(SR | R^C) = 0.70
No recession       P(R^C) = 0.60                   Fall                P(SR^C | R^C) = 0.30
Recession          P(R) = 0.40                     Rise                P(SR | R) = 0.20
Recession          P(R) = 0.40                     Fall                P(SR^C | R) = 0.80

Question 1
Based on the above data, what is the total probability of a stock rise? We need to find the
unconditional probability of a stock rise under all scenarios.
P(SR) = P(SR | R^C) P(R^C) + P(SR | R) P(R)
= 0.70*0.60 + 0.20*0.40 = 0.50
Question 2
What is the joint probability of having a recession and at the same time having a stock price
fall?
P(R and SR^C) = P(SR^C | R) x P(R) = 0.8*0.4 = 0.32
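Both questions can be answered with a short Python sketch that stores the analyst's table as a dictionary. The data layout and the function name total_probability are assumptions made for illustration.

```python
# Each scenario holds its own probability and the conditional
# probabilities of a stock rise or fall given that scenario.
scenarios = {
    "no recession": {"p": 0.60, "rise": 0.70, "fall": 0.30},
    "recession": {"p": 0.40, "rise": 0.20, "fall": 0.80},
}


def total_probability(scenarios, event):
    """Sum of P(event | scenario) x P(scenario) over all scenarios."""
    return sum(s["p"] * s[event] for s in scenarios.values())


# Question 1: unconditional probability of a stock rise.
p_rise = total_probability(scenarios, "rise")

# Question 2: joint probability of a recession and a stock fall.
recession = scenarios["recession"]
p_recession_and_fall = recession["fall"] * recession["p"]

print(f"P(SR) = {p_rise:.2f}")
print(f"P(R and SR^C) = {p_recession_and_fall:.2f}")
```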

Expected Value of Investments
Expected value is an important concept in investments. An investor will make use of
expected value to estimate the expected returns from their portfolio or to assess other factors
such as financial ratios.
We can use a random variable to describe asset returns. The expected value of a random
variable is defined as the weighted average of all possible outcomes of the random variable.
The weights are the probabilities of each outcome.
Let's say we have a random variable X. Its expected value can be represented as follows:
E(X) = P(x1) x1 + P(x2) x2 + ... + P(xn) xn
Where,
E(X) is the expected value of the random variable
P(xi) is the probability of outcome xi
xi represents a possible value (outcome) of the random variable.
In terms of investments, expected returns from an asset can be represented as E(R).
Let's say an investor is analyzing the performance of a stock under different states of
the economy and comes up with the following:
State of Economy Probability Return on Stock
1 0.20 15%
2 0.20 -5%
3 0.20 5%
4 0.20 35%
5 0.20 25%

The expected returns from this stock can be calculated as follows:
E(R) = 0.20*15%+0.20*(-5%)+0.20*5%+0.20*35%+0.20*25% = 15%
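The expected return is a probability-weighted average, which takes only a few lines of Python. This is a minimal sketch; the function name expected_value is illustrative.

```python
def expected_value(probabilities, outcomes):
    """Probability-weighted average of the outcomes."""
    return sum(p * x for p, x in zip(probabilities, outcomes))


# Five equally likely states of the economy and the stock return in each.
probabilities = [0.20, 0.20, 0.20, 0.20, 0.20]
stock_returns = [0.15, -0.05, 0.05, 0.35, 0.25]

expected_return = expected_value(probabilities, stock_returns)
print(f"E(R) = {expected_return:.2%}")
```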

Calculating Variance and Standard Deviation of Stock Returns
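We can also calculate the variance and standard deviation of the stock returns. The variance
is calculated as the probability-weighted sum of the squared differences between each outcome
and the expected return.
Variance = P(R1)(R1 - E(R))^2 + P(R2)(R2 - E(R))^2 + ... + P(Rn)(Rn - E(R))^2
For the stock in our example, with E(R) = 15%:
Variance = 0.20*(15% - 15%)^2 + 0.20*(-5% - 15%)^2 + 0.20*(5% - 15%)^2
+ 0.20*(35% - 15%)^2 + 0.20*(25% - 15%)^2 = 0.02
The standard deviation will be:
Standard deviation = SQRT(Variance) = SQRT(0.02) = 14.14%
Remember that the units of measuring standard deviation are the same as the units of
measuring stock returns, in this case percentage (%).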
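A minimal Python sketch of the variance and standard deviation calculation follows. It uses the five-state stock example from the expected value section (equal probabilities of 0.20 and returns of 15%, -5%, 5%, 35% and 25%); the function names are illustrative.

```python
import math


def expected_value(probabilities, outcomes):
    """Probability-weighted average of the outcomes."""
    return sum(p * x for p, x in zip(probabilities, outcomes))


def variance(probabilities, outcomes):
    """Probability-weighted sum of squared deviations from the mean."""
    mean = expected_value(probabilities, outcomes)
    return sum(p * (x - mean) ** 2 for p, x in zip(probabilities, outcomes))


probabilities = [0.20, 0.20, 0.20, 0.20, 0.20]
stock_returns = [0.15, -0.05, 0.05, 0.35, 0.25]

var_r = variance(probabilities, stock_returns)
std_r = math.sqrt(var_r)

print(f"Variance = {var_r:.4f}")
print(f"Standard deviation = {std_r:.2%}")
```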

Conditional Expected Values
We can use the concept of conditional probabilities to arrive at the conditional expected
values. Conditional expected values are conditional based on another event. For example,
expected value of a random variable X given scenario S. A practical example would be to
find the expected returns from a stock given rising inflation.
The total probability rule can be stated in terms of the expected values as follows:
E(X) = E(X | S1)P(S1) + E(X | S2)P(S2) + ... + E(X | Sn)P(Sn)
where the scenarios S1, S2, ..., Sn are mutually exclusive and exhaustive.
Example Using Tree Diagram
The following diagram shows the returns from a stock under different inflationary scenarios.










Conditional Expected Value
The diagram shows that the probability of high inflation is 0.70 and the probability of low
inflation is 0.30. Given high inflation, the probability of getting a return of 8% is 0.25 and
probability of getting a return of 7% is 0.75. Given low inflation, the probability of getting a
return of 6% is 0.40 and probability of getting a return of 5% is 0.60.
In the above tree diagram, the values in green are calculated values.
The joint probability of 8% return and high inflation is = 0.25*0.70 = 0.175
The joint probability of 7% return and high inflation is = 0.75*0.70 = 0.525
The joint probability of 6% return and low inflation is = 0.40*0.30 = 0.12
The joint probability of 5% return and low inflation is = 0.60*0.30 = 0.18
The expected return will be calculated as follows:
E(R) = 0.175*8%+0.525*7%+0.12*6%+0.18*5% = 6.695%
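The same result can be reached in two steps, which matches the total probability rule for expected values: first compute the conditional expected return within each inflation scenario, then weight those by the scenario probabilities. The Python sketch below assumes a nested dictionary as a hypothetical representation of the tree.

```python
# Each branch: scenario probability and {return: conditional probability}.
tree = {
    "high inflation": {"p": 0.70, "returns": {0.08: 0.25, 0.07: 0.75}},
    "low inflation": {"p": 0.30, "returns": {0.06: 0.40, 0.05: 0.60}},
}


def conditional_expectation(branch):
    """E(R | scenario): weighted average of returns within one branch."""
    return sum(r * p for r, p in branch["returns"].items())


conditional = {name: conditional_expectation(b) for name, b in tree.items()}

# E(R) = sum over scenarios of E(R | S) x P(S)
expected_return = sum(tree[name]["p"] * conditional[name] for name in tree)

for name, value in conditional.items():
    print(f"E(R | {name}) = {value:.2%}")
print(f"E(R) = {expected_return:.3%}")
```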

Calculating Covariance and Correlation
Both correlation and covariance are indicators of the relationship between two variables.
They indicate whether the variables are positively or negatively related, i.e., how they move
together. For example, what is the relationship between the performance of gold and S&P
500 index?




Covariance measures the co-movement between two variables, i.e., the amount by which the
two random variables show movement or change together.
The correlation also indicates the degree to which the two variables are related. It is
covariance rescaled into a unit-less measure that always lies between -1.0 and 1.0, which makes
it easier to interpret. The correlation of a variable with itself is always 1.
Calculating Covariance and Correlation
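Covariance is calculated using the following formula:
Cov(Ri, Rj) = Sum over all states of P x (Ri - E(Ri)) x (Rj - E(Rj))
where Ri and Rj represent the returns on two assets, i and j, and P is the probability of each state.
The following example illustrates the calculation of covariance for two assets, A and B: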
State of Economy   Probability (P)   RA     RB    RA - E(RA)   RB - E(RB)   P*(RA - E(RA))(RB - E(RB))
1                  10%               -10%   5%    -26.00%      -9.00%       0.234%
2                  30%               15%    12%   -1.00%       -2.00%       0.006%
3                  30%               18%    19%   2.00%        5.00%        0.030%
4                  20%               22%    15%   6.00%        1.00%        0.012%
5                  10%               27%    12%   11.00%       -2.00%       -0.022%
E(RA) = 16%   E(RB) = 14%   Cov = 0.0026

Once we have the covariance, we can calculate the correlation as follows:
Correlation(Ri, Rj) = Cov(Ri, Rj) / (Std(Ri) x Std(Rj))
We already have the covariance. The standard deviations of the two returns can be calculated
using the formulas learned under the section Variance and Standard Deviation.
Std(RA) = 9.40%
Std(RB) = 4.17%
Correlation = 0.0026/(9.40%*4.17%) = 0.66
Refer to the spreadsheet Covariance-Correlation.xlsx for detailed calculations.
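For readers without the spreadsheet, the calculation can be reproduced with a short Python sketch. It assumes the five-state example above (probabilities 10%, 30%, 30%, 20%, 10%) and the usual definition of correlation as covariance divided by the product of the standard deviations. Function and variable names are illustrative.

```python
import math


def expectation(probs, xs):
    """Probability-weighted mean."""
    return sum(p * x for p, x in zip(probs, xs))


def covariance(probs, xs, ys):
    """Probability-weighted sum of the products of deviations."""
    mean_x, mean_y = expectation(probs, xs), expectation(probs, ys)
    return sum(p * (x - mean_x) * (y - mean_y) for p, x, y in zip(probs, xs, ys))


probs = [0.10, 0.30, 0.30, 0.20, 0.10]
r_a = [-0.10, 0.15, 0.18, 0.22, 0.27]
r_b = [0.05, 0.12, 0.19, 0.15, 0.12]

cov_ab = covariance(probs, r_a, r_b)
std_a = math.sqrt(covariance(probs, r_a, r_a))  # variance is Cov(X, X)
std_b = math.sqrt(covariance(probs, r_b, r_b))
corr_ab = cov_ab / (std_a * std_b)

print(f"E(RA) = {expectation(probs, r_a):.2%}, E(RB) = {expectation(probs, r_b):.2%}")
print(f"Cov = {cov_ab:.4f}")
print(f"Std(RA) = {std_a:.2%}, Std(RB) = {std_b:.2%}")
print(f"Correlation = {corr_ab:.2f}")
```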
Covariance Interpreted
In financial markets covariance is positive when the variables show similar behavior i.e.
larger values of one variable correspond to larger values of another variable and the same
holds true for smaller values. When the covariance is negative it means the exact opposite
i.e. larger values of one variable correspond to smaller values of another variable.
The strength of the linear relationship however cannot be easily interpreted by the magnitude
of the calculated value. In order to interpret the strength a related measure called correlation
is used.
Correlation Interpreted
The correlation number would always be in the range of -1 to +1. A value of 1 means that the
variables always move in the same direction and a value of -1 means the two always move in
the opposite direction. In the case where the variables are independent, the covariance is zero,
which means the correlation is also zero. In other words, the two variables do not exhibit any
linear co-movement. A value strictly between -1 and +1 indicates a weaker linear relationship:
the closer the correlation is to zero, the less consistently the two variables move together.
Expected Value of a Portfolio [ Simple Markowitz theory]
We earlier learned about how to calculate the expected value, variance, and standard
deviation of a single random variable or an asset.
Portfolio managers will have many assets in their portfolios in different proportions. The
portfolio manager will have to therefore calculate the returns on the entire portfolio of assets.
The returns on the portfolio are calculated as the weighted average of the returns on all the
assets held in the portfolio.
Using the same properties, we can calculate the expected value (returns), variance and
standard deviation of a portfolio.
Expected Value (Expected returns)
The formula for portfolio returns is presented below:
E(Rp) = w1 x r1 + w2 x r2 + ... + wn x rn
Here w represents the weight of each asset, and r represents the return on each asset. For
example, if an asset constitutes 25% of the portfolio, its weight will be 0.25. Note that the sum of
all the asset weights will be equal to 1, as it represents 100% of the investment. The
returns here are single-period returns, with the same period for each asset's returns.
Let's take an example of a two-asset portfolio to understand how portfolio returns are
calculated. Let's say that our portfolio comprises two assets, A and B, and has the following
details.
Investment Returns
A 25000 10%
B 75000 6%
The table presents the amount invested in each asset and the returns from each asset. The
total amount invested is $100,000. We can calculate the weights for each asset as follows:
wA = 25000/100000 = 0.25
wB = 75000/100000 = 0.75
We can now calculate the portfolio returns as follows:
Rp = 0.25*10% + 0.75*6% = 2.5% + 4.5% = 7%
The same calculation can be extended for multiple assets.
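As a sketch of how the calculation extends to any number of assets, the Python snippet below derives the weights from the amounts invested and then takes the weighted average of returns. The dictionary layout is an assumption made for illustration.

```python
# Amount invested and single-period return for each asset (from the example).
investments = {"A": 25_000, "B": 75_000}
asset_returns = {"A": 0.10, "B": 0.06}

total_invested = sum(investments.values())
weights = {name: amount / total_invested for name, amount in investments.items()}

# Portfolio return is the weighted average of the asset returns.
portfolio_return = sum(weights[name] * asset_returns[name] for name in investments)

print(f"Weights: {weights}")
print(f"Portfolio return = {portfolio_return:.2%}")
```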

Variance and Standard Deviation of a Portfolio
We learned about how to calculate the standard deviation of a single asset. Let's now look at
how to calculate the standard deviation of a portfolio with two or more assets.
The returns of the portfolio were simply the weighted average of returns of all assets in the
portfolio. However, the calculation of the risk/standard deviation is not the same. While
calculating the variance, we also need to consider the covariance between the assets in the
portfolio. If the assets are perfectly positively correlated, then the simple weighted average
of the standard deviations will work. Otherwise, we have to account for the covariance, and
the equation changes.





Covariance reflects the degree to which two securities vary or change together, and is
represented as Cov(Ri, Rj). The problem with covariance is that its magnitude depends on the
units of the two variables, so it is difficult to compare across assets. Using covariance, we
can calculate the correlation between the assets using the following formula:
Correlation(Ri, Rj) = Cov(Ri, Rj) / (Std(Ri) x Std(Rj))
Correlation
After incorporating covariance, the standard deviation of a two-asset portfolio can be
calculated as follows:
Std(Rp) = SQRT(wi^2 x Std(Ri)^2 + wj^2 x Std(Rj)^2 + 2 x wi x wj x Cov(Ri, Rj))
Standard Deviation of a Two Asset Portfolio
In general as the correlation reduces, the risk of the portfolio reduces due to the
diversification benefits. Two assets that are perfectly negatively correlated provide the
maximum diversification benefit and hence minimize the risk.
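The diversification effect can be illustrated with a small Python sketch of the standard two-asset formula, in which the portfolio variance is w1^2 x Std1^2 + w2^2 x Std2^2 + 2 x w1 x w2 x Cov and Cov = Correlation x Std1 x Std2. The weights and standard deviations below are hypothetical values chosen only for illustration; they are not taken from the text.

```python
import math


def two_asset_portfolio_std(w_a, std_a, w_b, std_b, correlation):
    """Standard deviation of a two-asset portfolio."""
    cov_ab = correlation * std_a * std_b
    variance = (w_a * std_a) ** 2 + (w_b * std_b) ** 2 + 2 * w_a * w_b * cov_ab
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding


# Hypothetical inputs: equal weights, standard deviations of 10% and 20%.
w_a, w_b = 0.5, 0.5
std_a, std_b = 0.10, 0.20

for rho in (1.0, 0.5, 0.0, -0.5, -1.0):
    risk = two_asset_portfolio_std(w_a, std_a, w_b, std_b, rho)
    print(f"correlation {rho:+.1f}: portfolio std = {risk:.2%}")
```

With a correlation of +1 the result equals the weighted average of the two standard deviations, and it falls as the correlation falls.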
Example
Assume we have a portfolio with the following details:

The standard deviation can be calculated as follows:





This portfolio has an expected return of 16.80% and a portfolio risk of 11.09%.
Bayes Theorem
Bayes' Theorem, also known as Bayes' Law or Bayes' Rule, is an intuitive idea. We
adjust our perspective (the probability set) given new, relevant information. Formally, Bayes'
Theorem helps us move from an unconditional probability (what are the odds the economy
will grow?) to a conditional probability (given new evidence, what are the odds the economy
will grow?).
Suppose your daughter tells you that her friend is coming home tomorrow. Since you don't
know anything else, there is a 50% chance that the friend is female. Now she tells you that
her friend has long hair. With this additional information, it is now more likely that the
friend is female. Bayes' theorem can be applied in such scenarios to calculate the
updated probability (the probability that the friend is female).
A simple representation of Bayes' formula is as follows:
P(A | B) = P(B | A) x P(A) / P(B)
Example 1
The following information is available regarding drug testing.
0.5% of people are drug users.
A test has 99% accuracy: 99% of drug users and 99% of non-drug users are correctly
identified by it.
The question is: what is the probability of being a drug user if you've tested positive?






The solution according to Bayes' theorem is as follows:
P(pos | user) = 0.99 (the test is 99% effective at detecting users)
P(user) = 0.005 (the probability that a randomly chosen person is a drug user)
P(pos) = 0.01*0.995 + 0.99*0.005 = 0.0149, which is deduced as follows: non-users, 99.5% of
the population, have a 1% chance of testing positive, and users, 0.5% of the population,
have a 99% chance of testing positive.
P(user | pos) = 0.99*0.005/0.0149 = 0.33
The answer we arrive at is that there is only a 33% chance that a person who tests positive is
actually a drug user.
In this example some information is available about the proportions of users versus non-users,
but in practice such information may not be available or easy to determine.
Example 2
A company expects that there is a 5% probability that the economy will expand. Furthermore,
there is a 90% probability that the company's revenue will rise if the economy expands. If the
economy does not expand, there is only a 40% probability that the company's revenue will rise.
What is the probability that the economy has expanded given that the company's revenue has
risen?
We want to find out:
P (Economy Expansion | Company Revenue Rises) = P (EE | RR)
P(EE | RR) = P (RR | EE) * P(EE)/P(RR)
P (Economy Expansion) = P(EE) = 0.05
P (Revenue Rise | Economy Expansion) = P(RR | EE) = 0.90
P (Revenue Rise) = P(RR) = 0.05*0.90 + 0.95*0.40 = 0.425
P(EE | RR) = 0.90*0.05/0.425 = 0.106
If the company's revenue has risen, then there is a 10.6% probability that the economy has
expanded.
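Both examples follow the same pattern: the denominator of Bayes' formula is obtained from the total probability rule, and the posterior is the share of that total coming from the event of interest. A minimal Python sketch follows; the function name bayes_posterior and its parameter names are illustrative.

```python
def bayes_posterior(prior, likelihood, likelihood_if_not):
    """P(A | B) given P(A), P(B | A) and P(B | not A).

    The evidence P(B) comes from the total probability rule.
    """
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence


# Example 1: 0.5% users; the test flags 99% of users and 1% of non-users.
p_user_given_pos = bayes_posterior(prior=0.005, likelihood=0.99,
                                   likelihood_if_not=0.01)

# Example 2: 5% chance of expansion; revenue rises with probability 0.90
# if the economy expands and 0.40 if it does not.
p_expansion_given_rise = bayes_posterior(prior=0.05, likelihood=0.90,
                                         likelihood_if_not=0.40)

print(f"P(user | pos) = {p_user_given_pos:.3f}")
print(f"P(EE | RR) = {p_expansion_given_rise:.3f}")
```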

Multiplication Rule of Counting
Counting problems have to do with counting the total number of outcomes or logical
possibilities of something. For example, if we have to flip a coin, we can easily count the
number of outcomes. There are only two possible outcomes, either heads or tails. However,
as our problem or data set becomes large and complex, so does the total number of possible
outcomes. Counting the outcomes one by one may not be possible then and we will have to
use some techniques to make our job easy. In this section we will look at a few such
techniques.
Multiplication Rule of Counting
Problem 1
If there are A ways of doing something and B ways of doing another thing, then the total
number of ways to do both the things is = A x B.
For example, assume that your investment process involves two steps. The first step can be
done in two ways and the second step can be done in three ways. There are in total 2 x 3 = 6
ways of carrying out both the steps.
Problem 2: Arranging Items in a group.
Suppose we have a group of 5 people and we want them to stand in a queue. In the queue the
first position can be filled in 5 ways. Now position 1 is filled and we have 4 more people left.
The second position can be filled in 4 ways. Similarly, third position can be filled in 3 ways
and so on. The total number of ways these 5 positions can be filled is:
= 5 * 4 * 3 * 2 * 1 = 120
If the number of people was n, then this can be written as
n! = n x (n-1) x (n-2) x (n-3) x ... x 1
n! is known as n factorial.
Solving n factorial using BA II Plus calculator
Suppose you want to calculate 5!. To solve this on your calculator, press 5 [2ND] [x!].
Multinomial Formula (General Formula for Labelling)
The factorial formula above assumed only one group. However, we may have labelling
problems with multiple groups. For example, suppose that we have a group of 10 stocks and
we want to label four of these stocks as BUY, three stocks as SELL and 3 stocks as HOLD.
What is the total number of ways to do this?
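This problem can be solved using the general formula for labelling.
Number of ways = n! / (n1! x n2! x ... x nk!)
We have n objects and k different labels, with n1, n2, ..., nk objects of each type, so that
n1 + n2 + ... + nk = n.
In our example,
n1 = BUY = 4
n2 = SELL = 3
n3 = HOLD = 3
So, the number of ways in which the 10 stocks can be labeled is:
10! / (4! x 3! x 3!) = 3,628,800 / 864 = 4,200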
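The factorial and labelling counts can be verified in Python, assuming the standard multinomial formula n! / (n1! x n2! x ... x nk!). The helper name labelling_count is hypothetical; math.factorial is part of the standard library.

```python
from math import factorial


def labelling_count(*group_sizes):
    """Number of ways to assign labels with the given group sizes."""
    count = factorial(sum(group_sizes))
    for size in group_sizes:
        count //= factorial(size)  # each division is exact
    return count


queue_orders = factorial(5)              # 5 people standing in a queue
labellings = labelling_count(4, 3, 3)    # 4 BUY, 3 SELL, 3 HOLD out of 10 stocks

print(f"5! = {queue_orders}")
print(f"Ways to label the 10 stocks = {labellings}")
```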

Permutation and Combination Formula
Combination Formula
This is a special case of the multinomial formula where the number of label types is k = 2. This
means that the n objects can be labeled in only two ways, and n1 + n2 = n.
For example, suppose we had to label 4 of our 10 stocks as BUY and the remaining 6 as
SELL.
So, n = 10, n1 = 4 and n2 = 6.
Let's say n1 = r = 4; in that case n2 can be rewritten as n2 = n - r, or 10 - 4 = 6.
We can rewrite our formula as follows:
nCr = n! / ((n - r)! x r!)
This is called the combination formula and is read as n combination r, i.e., how many ways
can we select a group of size r from a group of n objects.
In our example, 10C4 = 10! / (6! x 4!) = 210.
The combination problems can be solved directly on your BA II Plus calculator using the nCr
function.
The Combination formula has its application in binomial trees.
Permutation Formula
Note that in combinations, the order in which the objects are listed does not matter, that is A,
B is the same as B, A. However, there could be a situation where the order matters. For
example, from our group of 10 stocks, we want to select 4 stocks and rank them as No. 1, 2,
3, and 4. To solve this problem, we need to use the permutation formula which accounts for
ordering of objects.
nPr = n! / (n - r)!
This is read as the number of permutations of r objects from total n objects.
In our example:
10P4 = 10! / (10 - 4)! = 10 x 9 x 8 x 7 = 5040
There are 5040 ways of selecting 4 objects from a group of 10 objects when ordering of
objects is important.
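Python's standard library (version 3.8 or later) provides math.comb and math.perm, which can be used to check both results against the factorial formulas. The sketch below assumes the usual definitions nCr = n! / ((n - r)! x r!) and nPr = n! / (n - r)!.

```python
from math import comb, factorial, perm

n, r = 10, 4

# Combination: order does not matter (choose 4 of 10 stocks to label BUY).
combinations = comb(n, r)
assert combinations == factorial(n) // (factorial(n - r) * factorial(r))

# Permutation: order matters (choose 4 of 10 stocks and rank them 1 to 4).
permutations = perm(n, r)
assert permutations == factorial(n) // factorial(n - r)

print(f"{n}C{r} = {combinations}")
print(f"{n}P{r} = {permutations}")
```

Each combination of 4 stocks can be ordered in 4! = 24 ways, which is why the permutation count is 24 times the combination count.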


CHAPTER 5
TABLE OF CONTENTS
Common Probability Distributions
Introduces some of the discrete and continuous probability distributions most commonly used
to describe the behavior of random variables.
Lesson Topics
What is a Probability Distribution
Discrete Vs. Continuous Random Variable
Cumulative Distribution Function
Discrete Uniform Random Variable
Bernoulli and Binomial Distribution











What is a Probability Distribution
We know that a random variable is an uncertain quantity or a number. Its value is determined
by chance. For example, the outcome of rolling a die is random. We could get any number
from 1 to 6. In the case of a fair die, the probability of getting any number is 1/6. Each outcome has
the same probability. For other random variables, however, the probabilities of the outcomes may differ.
A probability distribution is a graph or a table that describes the probabilities of each
outcome of a random variable.
In a probability distribution, each value or outcome of the random variable is represented as
x. The probability of getting x is represented as P(x). So, if X is the random variable, we are
saying that the probability of random variable X being equal to x is P(X=x) or P(x). This is
called the probability function.
The probability distribution of rolling a die is shown below:
xi P(xi)
1 1/6
2 1/6
3 1/6
4 1/6
5 1/6
6 1/6

Note that the sum of all probabilities should be equal to 1 and the probability of each
outcome, P(x) is between 0 and 1.




Discrete Vs. Continuous Random Variable
Discrete Random Variable
A random variable is said to be discrete if the total number of values it can take can be
counted. Alternatively, we can say that a discrete random variable can take only discrete,
countable values such as 1, 2, 3, 4, etc. For example, in the case of the roll of a die, there can be
only 6 outcomes. This is an example of a discrete random variable. Similarly, if we have to
pick a stock from the S&P 500 index, the stock picked is a discrete random variable. Each
outcome should also have a positive probability.
Let's say a variable X can take values 1, 2, 3, 4. The probabilities of each of these outcomes
are given below:
xi P(xi)
1 0.2
2 0.3
3 0.4
4 0.1
We can draw this distribution in the form of a histogram.





Note: What would be the probability of the random variable X being equal to 5?
P(5) = 0 because as per our definition the random variable X can only take values, 1, 2, 3 and
4.
Continuous Random Variable
In contrast to a discrete random variable, a random variable is called continuous if it can
take any of an infinite number of values within a range.
Examples include the height of a person, or the amount of rainfall that a city
receives. The number of possible outcomes is infinite. In that case, what is the probability
that the random variable X will take a certain value x?
P(x) will be 0 because we are talking about the probability of one outcome from an infinite
number of outcomes.
In finance, some variables such as price change of a stock, or the returns earned by an
investor are considered to be continuous, even though they are actually discrete, because the
number of possible outcomes is large, and the probability of each outcome is very small. For
example, the probability of an investor earning a return of exactly 8.25% is almost zero.
The probability distribution of a continuous random variable is described by a probability density
function.
Cumulative Distribution Function
We can also construct a cumulative distribution function for a random variable. A cumulative
distribution function gives the probability that the random variable X is less than or equal to
x, for every value x. In case of discrete random variables, the cumulative distribution function
is the sum of the probabilities of all outcomes unto and including the specific outcome x.
The cumulative distribution function is expressed as .
We will build upon our earlier probability distribution example.

xi P(xi) F(xi)
1 0.2 0.2
2 0.3 0.5
3 0.4 0.9
4 0.1 1

Probability that X =1 is 0.2
Probability that X = 1 or 2 = 0.2 + 0.3 = 0.5
Probability that X = 1 or 2 or 3 = 0.2 + 0.3 +0.4 = 0.9
Probability that X = 1 or 2 or 3 or 4= 0.2 + 0.3 +0.4 +0.1= 1.0
The histogram for cumulative distribution will look as follows:

The above cumulative distribution was for a discrete random variable. A continuous
random variable also has a cumulative distribution function.
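As a rough illustration of how the cumulative column is built from the probability column, the running sum can be sketched in Python. The names `pmf` and `cdf` are chosen here for illustration only and do not come from the text.

```python
from itertools import accumulate

# Probability distribution from the example above: P(X = x) for x = 1..4
pmf = {1: 0.2, 2: 0.3, 3: 0.4, 4: 0.1}

# F(x) is the running sum of the probabilities up to and including x
values = sorted(pmf)
running = accumulate(pmf[x] for x in values)
cdf = {x: round(total, 10) for x, total in zip(values, running)}

for x in values:
    print(x, pmf[x], cdf[x])
```

The last cumulative value is 1 because the probabilities of all outcomes must add up to 1.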

Discrete Uniform Random Variable
A discrete uniform random variable is a discrete random variable for which the probability of
each outcome is the same.
Example
The roll of a die is a discrete uniform random variable and has a discrete uniform probability
distribution.




The random variable X can take the values X = {1, 2, 3, 4, 5, 6}
Each outcome has a probability of 1/6.
The probability distribution and cumulative distribution functions are shown below:
xi   Probability distribution, P(xi)   Cumulative distribution, F(xi)
1    1/6                               1/6 = 0.1667
2    1/6                               2/6 = 0.333
3    1/6                               3/6 = 0.5
4    1/6                               4/6 = 0.667
5    1/6                               5/6 = 0.833
6    1/6                               6/6 = 1





Let's observe a few values from the above table.
P(3) = 1/6 or 0.1667
F(3) = 0.5
P(2<=X<=5) = 4*1/6 = 0.667
We can generalize this as follows:
P(x) = 1/6 or 0.1667
Cumulative distribution function for any outcome i is F(xi) = i*P(x)
Probability function for a range with k outcomes = k*P(x)
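The two generalizations above can be sketched in a few lines of Python for the fair die. This is only a minimal illustration; the function names `cdf` and `prob_range` are made up for this sketch, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]
p = Fraction(1, len(outcomes))  # each outcome has probability 1/6

def cdf(i):
    """F(x_i) = i * P(x) for the i-th outcome of a fair die."""
    return i * p

def prob_range(low, high):
    """P(low <= X <= high) = k * P(x), where k counts the outcomes in the range."""
    k = sum(1 for x in outcomes if low <= x <= high)
    return k * p

print(cdf(3))            # F(3)
print(prob_range(2, 5))  # P(2 <= X <= 5)
```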

Bernoulli and Binomial Distribution
A Bernoulli random variable is a random variable that takes a value of 1 in case of a success
and a value of 0 in case of a failure. We can also say that this random variable has a Bernoulli
distribution. A classic example is a single toss of a coin. When we toss a coin, the outcome can
be heads (success) with a probability p or tails (failure) with a probability of (1-p). The
important point here is that there is only a single toss.
Now suppose we perform n trials. Each trial is independent and will result in a
success with a probability p or a failure with a probability (1-p). From the n trials, suppose X
represents the number of successes. Then X is a binomial random variable with parameters
(n, p). Note that a Bernoulli random variable is a special case of a binomial random variable with
parameters (1, p). The variable X will have a binomial distribution.
The binomial distribution has the following characteristics:
For each trial there are only two possible outcomes, success or failure.
Probability of success, p, of each trial is fixed.
There are n trials.
Each trial is independent
The binomial probability function defines the probability of x successes from n trials.
The binomial probability function is given by the following formula:
$$P(X = x) = \binom{n}{x} p^x (1-p)^{n-x}, \qquad \binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$





Let's take an example to understand how this can be applied.
You have a pool of stocks having returns either above 5% or below 5%. The probability of
selecting a stock with above 5% returns is 0.70. You are going to pick 5 stocks. Assuming a
binomial distribution, what is the probability of picking exactly 2 stocks with above 5% returns?
Let's define our problem.
Success = Pick stock with above 5% returns
p = 0.70
n = 5
x = 2
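The worked solution did not survive in this copy of the text. A minimal sketch of the calculation, assuming the standard binomial probability function P(X = x) = C(n, x) * p^x * (1-p)^(n-x), could look as follows. The function name `binomial_pmf` is hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p, x = 5, 0.70, 2
prob = binomial_pmf(x, n, p)
print(round(prob, 4))  # 10 * 0.7^2 * 0.3^3
```

Under these assumptions the probability of picking exactly 2 stocks with above 5% returns is about 0.1323, or roughly 13%.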

Expected Value and Variance of a Binomial Distribution
For a binomial distribution, the expected value and variance are given as below:
$$E(X) = np, \qquad \mathrm{Var}(X) = np(1-p)$$
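The standard results E(X) = n*p and Var(X) = n*p*(1-p) can be checked numerically by summing over the whole binomial distribution. The sketch below reuses the stock example values n = 5 and p = 0.70 as an assumption.

```python
from math import comb

n, p = 5, 0.70

# Binomial probabilities P(X = x) for x = 0..n
pmf = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# Mean and variance computed directly from the distribution
mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))

print(round(mean, 4), round(n * p, 4))
print(round(variance, 4), round(n * p * (1 - p), 4))
```

Each line prints the value computed from the distribution next to the value from the closed-form formula, so the two can be compared.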

Properties of Bernoulli Distribution
Definition
The Bernoulli distribution is a discrete probability distribution that describes the outcome of a
single Bernoulli trial.
Each Bernoulli trial has the following characteristics:
There are only two outcomes, 1 or 0, i.e., success or failure, each time.
If the probability of success is p then the probability of failure is 1-p, and this remains
the same across each successive trial.
The probabilities are not affected by the outcomes of other trials, which means the
trials are independent.


The probability mass function, commonly abbreviated as PMF, is given by:
$$P(X = x) = p^x (1-p)^{1-x}, \qquad x \in \{0, 1\}$$
The graph of the Bernoulli probabilities is as
depicted below:

An example that best illustrates the Bernoulli distribution is a single toss of a coin. Every
successive toss is independent of the previous tosses when it comes to determining the
outcome.
When there is more than one trial, the Bernoulli distribution
extends to the binomial distribution. The set of trials is then called a binomial experiment and gives rise
to a binomial random variable.
The key characteristics of a binomial experiment are as follows:
A pre-determined number of Bernoulli trials is performed under one experiment.
Each trial has two labelled outcomes, success or
failure.
The probability of success in each trial is the same.
The trials are independent of each other, which means that the outcome of one trial is not
affected by the outcome of any other trial.




History and relevance
Swiss scientist Jacob Bernoulli is credited with the invention of this distribution and
he also came up with the idea of the Binomial distribution.
It is used in situations where a random variable is associated with two outcomes.
Properties
The expected value of the random variable is given by E(X) = p and can be derived as follows:
E(X) = 0*(1-p) + 1*p = p
The variance of the Bernoulli variable is given by p*(1-p) and can be derived as follows:
Var(X) = E(X^2) - [E(X)]^2 = p - p^2 = p*(1-p)
Applications
There are real life situations which involve noting if a specific event occurs or not
which is recorded as a success or a failure. The Bernoulli distribution finds
application in this situation.
Some examples that best explain such scenarios are the probability of getting a head
in a single coin flip, probability of having a boy child or the probability of getting a
hike in the salary package.

Normal Distribution
Definition
It is one of the most important continuous probability distributions and finds wide
application in real life by describing variables that display randomness. The distribution is
characterized by a bell curve, which has more weight in the center and tapers off on either side,
which means it has tails on either side. It takes only two moments, i.e. the mean μ and the
variance σ^2, to describe this function and it is therefore a parametric function.
The mean gives information about the location and the variance gives an idea of how
dispersed the values are.


The density function is given as follows:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$



Standard Normal Distribution
If the mean of the normal distribution is 0 and the variance is set to 1 then we get the standard
normal density function, which is symmetrical around the mean. In other words
the mean, the mode and the median of this distribution are all the same. The skewness of this
distribution is zero and the kurtosis has a value of 3. The variable which represents this
distribution is called the standard normal variable Z.
The standard normal distribution is depicted by the graph below:

The area under this graph for values of the standard normal variable from -2 to +2 represents
approximately 95% of the distribution. For the value range from -1 to +1 the area represents
approximately 68% of the distribution. A
table that gives the values of the standard normal variable against the percentage of the
distribution is called the standard normal table, and the percentage number is called the
confidence level in finance.
Many financial variables are modelled as approximately normal, and the distribution plays a key role in this
field. One of the most famous applications is in the Black-Scholes pricing model, where the
cumulative standard normal distribution function is used to derive the prices of options.
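The normal variable X with mean μ and variance σ^2 is derived from the standard normal variable Z by the following definition:
$$X = \mu + \sigma Z$$
And this has the desired moments:
$$E(X) = \mu, \qquad \mathrm{Var}(X) = \sigma^2$$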
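The areas quoted for the standard normal distribution can be checked with Python's standard library. This is a minimal sketch; the helper name `area_between` and the example values mu = 8 and sigma = 2 are assumptions made for illustration.

```python
from statistics import NormalDist

z = NormalDist(mu=0.0, sigma=1.0)  # standard normal variable

def area_between(low, high):
    """P(low <= Z <= high) from the cumulative distribution function."""
    return z.cdf(high) - z.cdf(low)

print(round(area_between(-1, 1), 4))        # roughly 68%
print(round(area_between(-2, 2), 4))        # roughly 95%
print(round(area_between(-1.96, 1.96), 4))  # very close to 95%

# A normal variable X = mu + sigma * Z with hypothetical mu = 8, sigma = 2
mu, sigma = 8.0, 2.0
x = NormalDist(mu, sigma)
print(round(x.cdf(mu + sigma) - x.cdf(mu - sigma), 4))  # same area as -1 to +1
```

The last line shows that one standard deviation either side of the mean covers the same share of the distribution whatever the mean and variance are, which is why a single standard normal table is enough.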


Value at Risk (VAR)
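Value at Risk is the loss on a portfolio that should not be exceeded, at a specified confidence
level, given the exposure to movements in market factors. Under the normal distribution it
is given by multiplying the standard deviation by the
standard normal deviate that corresponds to the confidence level.
This is given by the following equation:
$$\mathrm{VaR}(X, c) = E(X) - Q(X, c) = \alpha \sigma$$
Q(X, c) represents the quantile at a specified confidence level c, α is the corresponding standard normal deviate and σ is the standard deviation of X.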
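A minimal sketch of the calculation described above, assuming normally distributed returns with zero mean. The function name `parametric_var`, the portfolio value and the volatility figure are all hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def parametric_var(value, sigma, confidence):
    """VaR = value * alpha * sigma, with alpha the standard normal deviate."""
    alpha = NormalDist().inv_cdf(confidence)  # about 1.645 at 95%, 2.326 at 99%
    return value * alpha * sigma

# Hypothetical inputs: a 1,000,000 portfolio with 2% daily volatility
var_95 = parametric_var(1_000_000, 0.02, 0.95)
var_99 = parametric_var(1_000_000, 0.02, 0.99)
print(round(var_95), round(var_99))
```

A higher confidence level gives a larger standard normal deviate and therefore a larger VaR figure for the same portfolio.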
Scalability
An important property of the normal distribution is that a linear combination of jointly
normally distributed random variables is itself normally distributed; in other words, it shows
stability under addition. This is of great relevance because the
portfolio mean and variance are then enough to reconstruct the portfolio distribution.
Parametric vs Non-Parametric Distributions
Definitions
Parametric Distribution: A parametric distribution is used in statistics when an assumption
is made of the way the underlying data is distributed. An example would be when a variable
is assumed to be normally distributed. All subsequent analysis will then rely on this
assumption. The parameters associated with this assumption like mean and standard deviation
also contribute to the analysis.
Non-Parametric Distribution: This class of distributions is used in cases where assumptions
about the pattern or form of the underlying probability distribution from which the data are
drawn are not needed. Typically these are used where an attribute of the population
needs to be described, its relationship with another attribute needs to be determined, and/or
differences on that attribute across populations, time or related constructs need to be derived,
without requiring the underlying population to be distributed in a certain form or requiring
interval-level measurement. An example of this type of test is the Wilcoxon rank-sum test.
This applies to situations where only very weak assumptions have been made about the actual
form of the distribution of observations.
Applications
In the field of statistics sampling is most commonly used as it is almost impossible to include
each and every member of the population under consideration. A population typically is
composed of a very large number of observations.
There are real-world scenarios where the sample is very small, for example fewer
than 30 observations. This, together with the fact that it is difficult to determine the pattern of the
distribution and that interval-scale measurement has not been made, justifies the use of
non-parametric methods for analysis purposes. These methods also have the desirable
characteristic that they make very few assumptions about the pattern of behaviour.
Parametric methods should only be used when the assumptions about the distribution of the
underlying data are well met; any violation justifies the use of non-parametric methods.
Whenever the results of parametric and non-parametric
analysis converge, the former can be used. In practice most research questions are bivariate, and if the
bivariate results of both types of tests converge then it is reasonable to use parametric
techniques.
Comparison of Results
The results of non-parametric tests are more difficult to interpret than those of
parametric tests.
Whenever the underlying assumption about the distribution of the population holds,
parametric tests generally produce more accurate results than non-parametric methods.
An Example
Patients in a hospital have been classified by gender, and their durations of stay
need to be compared. The distribution for females is strongly skewed while that for males is
not. The medians of the two groups are close to each other whereas the means differ
strongly. The parametric test is less suitable in this case, as the assumption of normality is
not reasonable, and a non-parametric approach is more advisable.
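The effect of skewness on the mean and the median can be seen with a small made-up data set. The lengths of stay below are invented purely for illustration and are not taken from any real hospital.

```python
from statistics import mean, median

# Hypothetical lengths of stay in days (made up for illustration)
males = [3, 4, 4, 5, 5, 6, 6, 7]
females = [3, 4, 4, 5, 5, 6, 6, 40]  # one very long stay skews the data

print("males:  ", mean(males), median(males))
print("females:", mean(females), median(females))
```

In this sketch the two medians agree while the means differ, which is the pattern described in the example: a single extreme observation pulls the mean but leaves the median almost unchanged.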
Properties of Log-Normal Distribution
Definition
If the natural logarithm of a variable x is normally distributed then the variable itself
is said to be log-normally distributed. In other words if ln(x) is normally distributed then the
variable x has a log-normal distribution.
The probability density function for this variable is as follows:
$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x-\mu)^2}{2\sigma^2}\right), \qquad x > 0$$
In this equation μ and σ are the mean and the standard deviation of the variable's natural
logarithm.
A typical log-normal function looks as depicted in the graph below:



The plot of the log-normal distribution for various values of the standard deviation is as
below:

Occurrence
In financial markets the returns on asset prices are often assumed to be normally distributed. If the
return is denoted by the following equation:
r = (P1 - P0) / P0
where P0 and P1 are the prices at time 0 and 1 respectively, then in theory it is possible that
P1 might turn out to be negative, as r could end up below -1. In order to rule out such
situations it is safer to use the log-normal distribution.
If a random variable X is defined as P1/P0 and the logarithm of this variable, ln(X), is
normally distributed, then X can never be negative, which means that P1 can never be negative.

In the real world, stock prices do not fall below zero, because shareholders have limited
liability and cannot lose more than they invested. Also, in practice, when the
changes in price are small and the time period is short, the probability of a
negative price under the normal assumption is very small.
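The difference between the two assumptions can be illustrated with a small simulation. This is only a sketch: the starting price, the deliberately large volatility of 0.8 and the fixed random seed are all assumptions chosen so that the effect is easy to see.

```python
import math
import random

random.seed(42)  # fixed seed so the sketch is repeatable

p0 = 100.0
mu, sigma = 0.0, 0.8  # hypothetical, deliberately large volatility

simple_prices = []
log_prices = []
for _ in range(10_000):
    r = random.gauss(mu, sigma)
    simple_prices.append(p0 * (1 + r))   # normal simple return: price can go below 0
    log_prices.append(p0 * math.exp(r))  # normal log return: price is always positive

print(min(simple_prices) < 0)  # some simulated prices are negative
print(min(log_prices) > 0)     # every simulated price is positive
```

Treating r as the log return ln(P1/P0) makes P1 = P0 * exp(r), and the exponential of any real number is positive, so the log-normal price can never be negative.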
Applications
The log-normal distribution finds wide application in finance, and one of the most famous applications is in
the Black-Scholes option pricing model commonly used to value options. This model
assumes that the prices of commonly used financial assets, such as stock prices, foreign
exchange rates, price indices or stock market indices, are log-normally distributed, which
means that the log returns are normally distributed.
However, there have been several situations where this assumption fails, as in the case of very
sudden changes in market factors, such as stock market crashes or
economic collapses like the Asian Financial Crisis, which saw liquidity drying up in
the market rapidly, causing dramatic changes in the shape of the yield curve with short-term
rates suddenly skyrocketing. In these cases the distributions develop very fat tails, and models
other than Black-Scholes, such as log-Lévy distributions, are used to factor in this effect.
Independent and Identically Distributed Variables
Definition
I.I.D., or independent and identically distributed, variables are commonly used in probability
theory and statistics, and the term typically refers to a sequence of random variables. If the
random variables in the sequence all have the same probability distribution and are independent of each
other, then the variables are called independent and identically distributed.
This is a pre-requisite for many key theorems like the Central Limit Theorem, which form the
basis for the widespread use of the normal distribution and for many other statistical theories. It must be
noted that this assumption does not always hold true in practice. It is,
however, the default model for random variables.
Characteristics
Sum of I.I.Ds
The moment generating function of a sum of independent and identically distributed variables
is the product of the individual moment generating functions, where these exist.
When the characteristic function φ(t) of the sum is absolutely integrable, the sum has a
bounded, uniformly continuous density function given by:
$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \varphi(t)\, dt$$

Expected Value and Variance of average of I.I.Ds
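As above, let's assume that there are n independent and identically distributed random
variables X1, X2, ..., Xn, each with mean μ and variance σ^2, and we take the average of them.
The resulting variable, with its expected value and variance, is given by the following equation:
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad E(\bar{X}) = \mu, \qquad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$$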
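The standard results for the average of n i.i.d. variables, namely that its expected value equals the mean of a single variable and its variance equals the single-variable variance divided by n, can be checked exactly for fair dice. The sketch below enumerates every possible roll of n = 3 dice; the choice of dice and of n is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4, 5, 6]

def mean_and_variance(values):
    """Exact mean and variance of a list of equally likely values."""
    count = len(values)
    m = sum(values, Fraction(0)) / count
    v = sum(((x - m) ** 2 for x in values), Fraction(0)) / count
    return m, v

# Mean and variance of a single fair die
mu, var = mean_and_variance([Fraction(f) for f in faces])

# Average of n dice, enumerated over all equally likely rolls
n = 3
averages = [Fraction(sum(roll), n) for roll in product(faces, repeat=n)]
mean_avg, var_avg = mean_and_variance(averages)

print(mu, var)            # single die
print(mean_avg, var_avg)  # average of n dice
```

The mean is unchanged by averaging, while the variance shrinks in proportion to the number of variables averaged.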




Financial Volatility
Financial time series are often modelled with Gaussian distributions. In order to calculate
the volatility of a series of financial data, such as Brazilian real / US$ exchange rates, the data are
expressed as a series of reduced independent and identically distributed variables which form
a best fit for the real-world data.
An exponential law, when applied to this set of reduced variables, helps explain their volatilities.
It is to be noted that this is based on the assumption that the stochastic process always
exhibits a characteristic period.





Applications
Some of the most common uses of I.I.D. variables are illustrated in the following
situations:
A series of consecutive tosses of the same coin, fair or unfair
A series of consecutive rolls of the same die, fair or unfair
A series of spins of the same roulette wheel, fair or unfair
Other common applications are in signal or image processing and in testing
hypotheses about the means of random variables, which relies on the central limit theorem.
Linear Combinations of Random Variables
The joint distribution of a pair of linear combinations of independent standard normal random variables
is a bivariate normal distribution. It forms the basis for all
calculations involving arbitrary means and variances relating to the more general bivariate
normal distribution. The property of rotational symmetry implies that the joint distribution of
any two linear combinations of the two variables is bivariate normal.
The same concept can also be extended to linear combinations of any number of independent
normal variables. By applying scaling, the variables need not be standard normal.
If V and W are two linear combinations of independent normal variables, then their joint
distribution is bivariate normal. The parameters of the bivariate distribution can then be easily
calculated.
The following result is implied by the above definitions:
Two linear combinations of independent normal variables are independent if and only if
they are not correlated with each other.
What can be said about linear combinations of random variables can also be said about
functions of random variables that are formed by a linear relationship. A function of random
variables is itself a random variable and it can either be built from linear or non-linear
relationships between variables.
Given that Y is a linear combination of variables X1, X2, ..., Xp and constants c1, c2, ..., cp:
Y = c1X1 + c2X2 + ... + cpXp
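For independent variables, the mean of Y is the weighted sum of the individual means and the variance of Y is the sum of the squared constants times the individual variances. A minimal sketch follows; the function name `linear_combination_moments` and the two-asset figures are hypothetical, and independence between the variables is assumed.

```python
import math

def linear_combination_moments(c, mu, sigma):
    """Mean and standard deviation of Y = sum(c_i * X_i) for independent X_i."""
    mean_y = sum(ci * mi for ci, mi in zip(c, mu))
    var_y = sum((ci * si) ** 2 for ci, si in zip(c, sigma))
    return mean_y, math.sqrt(var_y)

# Hypothetical two-asset portfolio: weights, expected returns, volatilities
weights = [0.6, 0.4]
means = [0.08, 0.05]
vols = [0.20, 0.10]

mean_y, sd_y = linear_combination_moments(weights, means, vols)
print(round(mean_y, 4), round(sd_y, 4))
```

If the variables are correlated, the variance also picks up covariance terms, which this sketch deliberately leaves out.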







Multi-Variate Normal Distribution
The bivariate case, which applies to linear combinations of two sets of independent normal random
variables, can also be extended to linear combinations of several sets of independent normal
random variables, and the result is called a multivariate normal distribution.
Independence between the several sets of variables requires that the covariance
between each and every pair of them is zero, i.e. the changes in value between any
two variables are not linearly linked to each other. However, it must be noted that in general the
reverse is not necessarily true, i.e. if two or more sets of data are uncorrelated it does not
necessarily mean they are independent. The multivariate normal distribution is a special case
in which zero covariance does imply independence.
The multivariate normal distribution has some very important properties.
If X is distributed multivariate normally:
Linear combinations of X are normally distributed
All subsets of X are multivariate normally distributed
Zero covariance between a pair of variables indicates that the variables are
independent
Conditional distributions of X are also multivariate normal
Dependent Normal Variables
When normal variables have a relationship of dependence among them, i.e. there is
covariance among the sets of data and they are correlated with
each other, a linear combination of them is guaranteed to be
normally distributed only if the variables are jointly (multivariate) normal. If each variable is
normal on its own but the dependence between them takes a different form, a linear
combination need not be normally distributed.
This issue is addressed by the concept of copulas, which describe the
dependence structure between variables separately from their individual distributions.

EPILOGUE
This book does not end here and we will keep it updated in subsequent editions as
well.
So we hope you keep enjoying our books, have a nice reading experience.
Thanks

Professor Alberetto Albarak
University of Turkey
Department of Undergraduate Studies
