0, 1, 2, 3, 4, 5, 6, 7, 8
Does it matter?
How to find overall spread?
For each data point, find its deviation from the mean, i.e.
data point minus mean
0, 1, 2, 3, 4, 5, 6, 7, 8
The mean is 4, so the deviations are -4, -3, -2, -1, 0, 1, 2, 3, 4, which sum to 0.
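As a rough sketch in Python (the variable names are illustrative, not from the slides), the deviations for the data above can be computed like this. The positive and negative deviations cancel, so their sum is zero.

```python
# Data from the slide; variable names are illustrative.
data = [0, 1, 2, 3, 4, 5, 6, 7, 8]

mean = sum(data) / len(data)

# Deviation of each point = data point minus mean.
deviations = [x - mean for x in data]

print(mean)             # 4.0
print(deviations)       # [-4.0, -3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
print(sum(deviations))  # 0.0
```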
How to avoid cancelling out?
Take the absolute value of each deviation and average those values.
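A minimal sketch of this averaging step, the mean absolute deviation, might look as follows in Python. The function name `mean_absolute_deviation` is an assumption made for illustration.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

# Same data as on the earlier slide.
data = [0, 1, 2, 3, 4, 5, 6, 7, 8]

result = mean_absolute_deviation(data)
print(round(result, 2))  # 2.22
```

Because every deviation is made non-negative before averaging, the values no longer cancel out.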
Compare the rivers using MAD & SD
River 1  Devn     Abs Dev  Dev Sqrd    River 2  Devn     Abs Dev  Dev Sqrd
0                                      3.00
1                                      3.25
2                                      3.50
3                                      3.75
4                                      4.00
5                                      4.25
6                                      4.50
7                                      4.75
8                                      5.00
MEAN     Avg Dev  MAD      Std Dev     MEAN     Avg Dev  MAD      Std Dev
4                                      4
Compare the rivers using MAD & SD
River 1  Devn     Abs Dev  Dev Sqrd    River 2  Devn     Abs Dev  Dev Sqrd
0        -4                            3.00     -1.00
1        -3                            3.25     -0.75
2        -2                            3.50     -0.50
3        -1                            3.75     -0.25
4        0                             4.00     0.00
5        1                             4.25     0.25
6        2                             4.50     0.50
7        3                             4.75     0.75
8        4                             5.00     1.00
MEAN     Avg Dev  MAD      Std Dev     MEAN     Avg Dev  MAD      Std Dev
4        0                             4        0
Compare the rivers using MAD & SD
River 1  Devn     Abs Dev  Dev Sqrd    River 2  Devn     Abs Dev  Dev Sqrd
0        -4       4                    3.00     -1.00    1.00
1        -3       3                    3.25     -0.75    0.75
2        -2       2                    3.50     -0.50    0.50
3        -1       1                    3.75     -0.25    0.25
4        0        0                    4.00     0.00     0.00
5        1        1                    4.25     0.25     0.25
6        2        2                    4.50     0.50     0.50
7        3        3                    4.75     0.75     0.75
8        4        4                    5.00     1.00     1.00
MEAN     Avg Dev  MAD      Std Dev     MEAN     Avg Dev  MAD      Std Dev
4        0        2.2                  4        0        0.56
MAD & SD tell us River 1 is unsafe
River 1  Devn     Abs Dev  Dev Sqrd    River 2  Devn     Abs Dev  Dev Sqrd
0        -4       4        16          3.00     -1.00    1.00     1.00
1        -3       3        9           3.25     -0.75    0.75     0.56
2        -2       2        4           3.50     -0.50    0.50     0.25
3        -1       1        1           3.75     -0.25    0.25     0.06
4        0        0        0           4.00     0.00     0.00     0.00
5        1        1        1           4.25     0.25     0.25     0.06
6        2        2        4           4.50     0.50     0.50     0.25
7        3        3        9           4.75     0.75     0.75     0.56
8        4        4        16          5.00     1.00     1.00     1.00
MEAN     Avg Dev  MAD      Std Dev     MEAN     Avg Dev  MAD      Std Dev
4        0        2.2      2.6         4        0        0.56     0.65
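The summary row of the table can be checked with a short Python sketch. The helper names `mad` and `population_sd` are illustrative. The sketch assumes the Std Dev column uses the population form (divide by n); this is an inference from the numbers in the table, which that form reproduces.

```python
import math

def mad(values):
    """Mean absolute deviation: average of |x - mean|."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

def population_sd(values):
    """Square root of the average squared deviation (divides by n, an assumption)."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))

river_1 = [0, 1, 2, 3, 4, 5, 6, 7, 8]
river_2 = [3.00, 3.25, 3.50, 3.75, 4.00, 4.25, 4.50, 4.75, 5.00]

for name, depths in (("River 1", river_1), ("River 2", river_2)):
    print(name, round(mad(depths), 2), round(population_sd(depths), 2))
# River 1 2.22 2.58
# River 2 0.56 0.65
```

Both rivers have a mean of 4, but River 1 has roughly four times the MAD and SD of River 2.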
Another example: here MAD is the same for both
data sets, but Std Dev differs between the two
data 1   Devn     Abs Dev  Dev Sqrd    data 2   Devn     Abs Dev  Dev Sqrd
4        -6       6        36          5        -5       5        25
6        -4       4        16          5        -5       5        25
10       0        0        0           10       0        0        0
14       4        4        16          15       5        5        25
16       6        6        36          15       5        5        25
MEAN     Avg Dev  MAD      Std Dev     MEAN     Avg Dev  MAD      Std Dev
10       0        4        4.6         10       0        4        4.5
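The same comparison can be sketched in Python. Here `statistics.pstdev` from the standard library gives the population standard deviation, and `mad` is an illustrative helper, not something named in the slides.

```python
from statistics import pstdev

def mad(values):
    """Mean absolute deviation: average of |x - mean|."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

data_1 = [4, 6, 10, 14, 16]
data_2 = [5, 5, 10, 15, 15]

# MAD cannot tell the two sets apart.
print(mad(data_1), mad(data_2))  # 4.0 4.0

# The standard deviation can.
print(round(pstdev(data_1), 2))  # 4.56
print(round(pstdev(data_2), 2))  # 4.47
```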
Let us look at the data closely without all the
workings of MAD and SD
data 1 data 2
4 5
6 5
10 10
14 15
16 15
MEAN MEAN
10 10
Now we see the data with the workings of MAD
and SD
data 1   Devn     Abs Dev  Dev Sqrd    data 2   Devn     Abs Dev  Dev Sqrd
4        -6       6        36          5        -5       5        25
6        -4       4        16          5        -5       5        25
10       0        0        0           10       0        0        0
14       4        4        16          15       5        5        25
16       6        6        36          15       5        5        25
MEAN     Avg Dev  MAD      Std Dev     MEAN     Avg Dev  MAD      Std Dev
10       0        4        4.6         10       0        4        4.5
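Why do we prefer SD to MAD?
When the spread is small, the values of the standard deviation
and the mean absolute deviation are close to each other.
SD squares each deviation, so it gives more weight to points far from the mean;
this is why it separates data 1 from data 2 even though their MAD is the same.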
Working with MS Excel
σ (pronounced sigma) is the symbol for the population standard deviation.
s (always lowercase) is the symbol for the sample standard deviation.
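The slide refers to MS Excel, where the functions STDEV.P and STDEV.S compute these two quantities. As a hedged sketch in Python instead, the standard library's `statistics.pstdev` and `statistics.stdev` show the difference: the population form divides the summed squared deviations by n, and the sample form divides by n - 1. The data set is reused from the earlier slides for illustration.

```python
from statistics import pstdev, stdev

data = [0, 1, 2, 3, 4, 5, 6, 7, 8]

sigma = pstdev(data)  # population SD: divides by n
s = stdev(data)       # sample SD: divides by n - 1

print(round(sigma, 2))  # 2.58
print(round(s, 2))      # 2.74
```

The sample value is always a little larger than the population value for the same data, and the gap shrinks as n grows.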