
Measures of Central Tendency

Describing Data Numerically

Center and Location
• Mean
• Median
• Mode
• Weighted Mean

Other Measures of Location
• Percentiles
• Quartiles

Variation
• Range
• Interquartile Range
• Variance
• Standard Deviation
• Coefficient of Variation
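Measures of Center and Location

Center and Location: Mean, Median, Mode, Weighted Mean

Mean (sample and population):

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]

Weighted Mean (sample and population):

\[ \bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} \qquad \mu_W = \frac{\sum w_i x_i}{\sum w_i} \]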
Mean (Arithmetic Average)
 The most common measure of central tendency
 Mean = sum of values divided by the number of values
 Affected by extreme values (outliers)

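[Figure: two dot plots on a 0 to 10 scale, one with Mean = 3 and one with Mean = 4]

Values 1, 2, 3, 4, 5:

\[ \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3 \]

Values 1, 2, 3, 4, 10:

\[ \frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4 \]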
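
The effect of a single extreme value on the mean can be checked with a short Python sketch. It uses the two five-value data sets from this slide; the variable names are illustrative only.

```python
from statistics import mean

without_outlier = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 10]   # same data, largest value replaced by 10

mean_a = mean(without_outlier)    # sum 15 divided by 5 values
mean_b = mean(with_outlier)       # sum 20 divided by 5 values

print("Mean without outlier:", mean_a)
print("Mean with outlier:   ", mean_b)
```

Changing one value out of five moves the mean from 3 to 4, which is why the mean is described as affected by extreme values.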
Mode

 A measure of central tendency


 Value that occurs most often
 Not affected by extreme values
 Used for either numerical or categorical data
 There may be no mode
 There may be several modes

[Figure: two dot plots, one data set with Mode = 5 and one data set with No Mode]
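
A minimal Python sketch of these properties follows. The helper `modes` and the sample data are hypothetical, chosen only for illustration. The sketch assumes the convention that a data set in which no value repeats has no mode.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); [] if the data is empty or no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []   # every value occurs once: treated here as "no mode"
    return [value for value, count in counts.items() if count == highest]

one_mode = modes([0, 1, 2, 5, 5, 5, 9, 14])                     # numerical data
several_modes = modes(["red", "blue", "red", "blue", "green"])  # categorical data
no_mode = modes([0, 1, 2, 3, 4, 5, 6])

print(one_mode, several_modes, no_mode)
```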
Weighted Mean

• Used when values are grouped by frequency, or weighted by the relative importance of each value to the overall total.

Example: Sample of 26 Repair Projects

Days to Complete    Frequency
5                   4
6                   12
7                   8
8                   2

Weighted Mean Days to Complete:

\[ \bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4 \times 5) + (12 \times 6) + (8 \times 7) + (2 \times 8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31 \text{ days} \]
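
The same calculation as a short Python sketch, using the days and frequencies from the repair-project table. The variable names are illustrative.

```python
days = [5, 6, 7, 8]          # x_i: days to complete
frequency = [4, 12, 8, 2]    # w_i: number of projects for each value

weighted_sum = sum(w * x for w, x in zip(frequency, days))
total_weight = sum(frequency)
weighted_mean = weighted_sum / total_weight

print(f"{weighted_sum} / {total_weight} = {weighted_mean:.2f} days")
```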
Best Measure of Location

• Mean is generally used, unless extreme values (outliers) exist.
• Then median is often used, since the median is not sensitive to extreme values.
• Example: Median home prices may be reported for a region, since the median is less sensitive to outliers.
Matching Average to Data

Measure of         Appropriate to choose when...        Should not be used when...
Central Tendency

Mean               • No situation precludes it           • Extreme scores
                   • First choice measure of             • Skewed distribution
                     central tendency                    • Ordinal scale
                                                         • Nominal scale

Median             • Extreme scores                      • Nominal scale
                   • Skewed distribution
                   • Ordinal scale

Mode               • Nominal scales                      • Interval or ratio data, except
                   • Discrete variables                    to accompany mean or median
                   • Describing shape of
                     distribution
Shape of a Distribution

 Describes how data is distributed


 Symmetric or skewed

• Left-Skewed: Mean < Median < Mode (longer tail extends to left)
• Symmetric: Mean = Median = Mode
• Right-Skewed: Mode < Median < Mean (longer tail extends to right)
Example

 Five houses on a hill by the beach


House Prices:

$2,000,000
500,000
300,000
100,000
100,000
Summary Statistics

House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000

• Mean: $3,000,000 / 5 = $600,000
• Median: middle value of ranked data = $300,000
• Mode: most frequent value = $100,000
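
These three summary statistics can be reproduced with Python's standard `statistics` module. This is a small sketch on the five house prices above; the variable names are illustrative.

```python
from statistics import mean, median, mode

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

price_mean = mean(prices)      # sum of values divided by the number of values
price_median = median(prices)  # middle value of the ranked data
price_mode = mode(prices)      # most frequent value

print(price_mean, price_median, price_mode)
```

Here Mode < Median < Mean, the ordering listed for a right-skewed distribution: the single $2,000,000 house pulls the mean well above the median.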
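Method's Name                   Nature of Data
                                Ungrouped Data                                   Grouped Data
Direct Method                   \( \bar{x} = \frac{\sum x}{n} \)                 \( \bar{x} = \frac{\sum fx}{\sum f} \)
Indirect or Short-Cut Method    \( \bar{x} = A + \frac{\sum D}{n} \)             \( \bar{x} = A + \frac{\sum fD}{\sum f} \)
Method of Step-Deviation        \( \bar{x} = A + \frac{\sum u}{n} \times c \)    \( \bar{x} = A + \frac{\sum fu}{\sum f} \times h \)

Where
\( x \) indicates values of the variable.
\( n \) indicates number of values of \( x \).
\( f \) indicates frequency of different groups.
\( A \) indicates assumed mean.
\( D \) indicates deviation from \( A \), i.e. \( D = x - A \).
\( u \) indicates step-deviation, \( u = D / c \) (or \( D / h \) for grouped data), and \( c \) indicates common divisor.
\( h \) indicates size of class or class interval in case of grouped data.
\( \sum \) indicates summation or addition.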
Example
The following data show the distance covered by 100 persons to perform their routine jobs.

Distance (Km)        0-10    10-20    20-30    30-40
Number of Persons    10      20       40       30

Calculate the Arithmetic Mean by the Step-Deviation Method; also explain why it is better than the Direct Method in this particular case.
Solution
The given distribution is grouped data. The variable involved is the distance covered, while the number of persons represents the frequencies.

Distance Covered (Km)   Number of Persons (f)   Mid Points (x)   u = (x - 15)/10   fu
0-10                    10                      5                -1                -10
10-20                   20                      15                0                  0
20-30                   40                      25               +1                 40
30-40                   30                      35               +2                 60
Total                   ∑f = 100                                                   ∑fu = 90
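Now we will find the Arithmetic Mean as

\[ \bar{x} = A + \frac{\sum fu}{\sum f} \times h \]

Where

\( A = 15 \), \( \sum fu = 90 \), \( \sum f = 100 \) and \( h = 10 \)

\[ \bar{x} = 15 + \frac{90}{100} \times 10 = 15 + 9 = 24 \text{ Km} \]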
Explanation:
The mid points (x) show that each mid point is a multiple of 5 and that consecutive mid points are 10 apart, which is the class size or interval (h). Dividing the deviations from the assumed mean by h therefore reduces them to small whole numbers (-1, 0, +1, +2), which keeps the arithmetic simple. For this reason the Step-Deviation Method is preferable to the Direct Method here.
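
The worked solution can be cross-checked with a short Python sketch that computes the grouped mean both ways. The function names are hypothetical. The assumed mean A = 15 and class width h = 10 follow the solution table, where u = -1, 0, +1, +2.

```python
def step_deviation_mean(midpoints, freqs, assumed_mean, class_width):
    """Grouped mean via A + (sum(f*u) / sum(f)) * h, with u = (x - A) / h."""
    u = [(x - assumed_mean) / class_width for x in midpoints]
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))
    return assumed_mean + sum_fu / sum(freqs) * class_width

def direct_mean(midpoints, freqs):
    """Grouped mean via sum(f*x) / sum(f)."""
    return sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)

midpoints = [5, 15, 25, 35]   # mid points of 0-10, 10-20, 20-30, 30-40 Km
freqs = [10, 20, 40, 30]      # number of persons

mean_step = step_deviation_mean(midpoints, freqs, assumed_mean=15, class_width=10)
mean_direct = direct_mean(midpoints, freqs)
print(mean_step, mean_direct)
```

Both methods give the same mean of 24 Km; the step-deviation route only changes how large the intermediate numbers are.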
Example
The following frequency distribution shows the marks obtained by 50 students in statistics at a certain college. Find the arithmetic mean using (1) the Direct Method, (2) the Short-Cut Method and (3) the Step-Deviation Method.

Marks       20-29   30-39   40-49   50-59   60-69   70-79   80-89
Frequency   1       5       12      15      9       6       2
                        Direct Method   Short-Cut Method      Step-Deviation Method
Marks    f     x        fx              D = x - A    fD       u       fu
20-29    1     24.5     24.5            -30          -30      -3      -3
30-39    5     34.5     172.5           -20          -100     -2      -10
40-49    12    44.5     534             -10          -120     -1      -12
50-59    15    54.5     817.5           0            0        0       0
60-69    9     64.5     580.5           10           90       1       9
70-79    6     74.5     447             20           120      2       12
80-89    2     84.5     169             30           60       3       6
Total    50             2745                         20               2
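(1) Direct Method:

\[ \bar{x} = \frac{\sum fx}{\sum f} = \frac{2745}{50} = 54.9 \text{ Marks} \]

(2) Short-Cut Method:

\[ \bar{x} = A + \frac{\sum fD}{\sum f} \]

Where \( D = x - A \); \( A = 54.5 \)

\[ \bar{x} = 54.5 + \frac{20}{50} = 54.5 + 0.4 = 54.9 \text{ Marks} \]

(3) Step-Deviation Method:

\[ \bar{x} = A + \frac{\sum fu}{\sum f} \times h \]

Where \( u = \frac{x - A}{h} \); \( A = 54.5 \), \( h = 10 \)

\[ \bar{x} = 54.5 + \frac{2}{50} \times 10 = 54.5 + 0.4 = 54.9 \text{ Marks} \]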
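
As a check on the table totals and the three results, the following Python sketch recomputes the column sums from the mid points and frequencies. The variable names are illustrative; A = 54.5 and h = 10 are the values used in the example.

```python
midpoints = [24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5]   # x
freqs = [1, 5, 12, 15, 9, 6, 2]                          # f
A, h = 54.5, 10                                          # assumed mean, class width

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))            # Direct column total
sum_fD = sum(f * (x - A) for f, x in zip(freqs, midpoints))      # Short-Cut column total
sum_fu = sum(f * (x - A) / h for f, x in zip(freqs, midpoints))  # Step-Deviation column total

direct = sum_fx / n
short_cut = A + sum_fD / n
step_deviation = A + sum_fu / n * h

print(n, sum_fx, sum_fD, sum_fu)
print(direct, short_cut, step_deviation)
```

All three methods are algebraic rearrangements of the same quantity, so they agree up to floating-point rounding.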


Exercise

HEIGHT (in)   Mid value (X)   FREQUENCY (f)   fX
60-62         61              5
63-65         64              18
66-68         67              42
69-71         70              27
72-74         73              8
Total                         100
Exercise

OVERTIME (HRS)   NO OF EMPLOYEES   MID VALUE   d = (Mid value - A) / class interval   Freq x d
10 - 15          11                12.5
15 - 20          20                17.5
20 - 25          35                22.5
25 - 30          20                27.5
30 - 35          8                 32.5
35 - 40          6                 37.5
Total            100

Note: say 22.5 is the assumed mean (A).
MEDIAN

The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one.

For an odd number of observations:

• Median = the ((n+1)/2)th observation

For an even number of observations:

• Median = average of the (n/2)th and (n/2 + 1)th observations
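
The two cases can be sketched in Python as follows. The function name `median_of` and the sample lists are illustrative, not part of the slides. Positions in the rule are 1-based, while Python lists are 0-based, hence the `- 1`.

```python
def median_of(values):
    """Median of ungrouped data using the odd/even position rule."""
    ordered = sorted(values)          # arrange from lowest to highest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    if n % 2 == 1:
        # odd n: the ((n + 1) / 2)th observation
        return ordered[(n + 1) // 2 - 1]
    # even n: average of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

odd_case = median_of([7, 1, 5])        # ordered: 1, 5, 7
even_case = median_of([4, 1, 3, 2])    # ordered: 1, 2, 3, 4
print(odd_case, even_case)
```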
FORMULA for MEDIAN

\[ \text{Median} = l + \frac{(n/2) - cf}{f} \times h \]

l = lower limit of median class interval
cf = cumulative freq of class prior to median class
f = freq of median class
h = width of median class
n = total no of observations
Example
CALCULATE MEDIAN VALUE

AGE OF AUTOS   NO OF AUTOS (f)   CUMULATIVE FREQ
0-4            13                13
4-8            29                42
8-12           48                90   (MEDIAN CLASS)
12-16          22                112
16-20          8                 120
n = 120
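
One way to carry out this calculation is sketched below in Python, applying the grouped-median formula to the table above. The variable names are illustrative. The median class is taken to be the first class whose cumulative frequency reaches n/2, which matches the class marked in the table.

```python
from itertools import accumulate

lower_limits = [0, 4, 8, 12, 16]   # lower limit of each age class
freqs = [13, 29, 48, 22, 8]        # no of autos per class
h = 4                              # class width

n = sum(freqs)
cumulative = list(accumulate(freqs))

# index of the median class: first class whose cumulative freq reaches n / 2
k = next(i for i, c in enumerate(cumulative) if c >= n / 2)

l = lower_limits[k]                      # lower limit of median class
cf = cumulative[k - 1] if k > 0 else 0   # cumulative freq before median class
f = freqs[k]                             # freq of median class

median_value = l + (n / 2 - cf) / f * h
print(cumulative, k, median_value)
```

With l = 8, cf = 42, f = 48 and h = 4, this evaluates 8 + (60 - 42)/48 x 4 = 9.5.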
Other Location Measures

Other Measures of Location: Percentiles, Quartiles

Percentiles: the pth percentile in a data array:
• p% are less than or equal to this value
• (100 - p)% are greater than or equal to this value
(where 0 ≤ p ≤ 100)

Quartiles:
• 1st quartile = 25th percentile
• 2nd quartile = 50th percentile = median
• 3rd quartile = 75th percentile
Percentiles
The pth percentile in an ordered array of n values is the value in the ith position, where

\[ i = \frac{p}{100}(n + 1) \]

Example: The 60th percentile in an ordered array of 19 values is the value in the 12th position:

\[ i = \frac{p}{100}(n + 1) = \frac{60}{100}(19 + 1) = 12 \]
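
The position rule is a one-line calculation. A small Python sketch follows, with a hypothetical helper name.

```python
def percentile_position(p, n):
    """1-based position i = p / 100 * (n + 1) of the pth percentile among n ordered values."""
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    return p * (n + 1) / 100

position_60th = percentile_position(60, 19)
print(position_60th)
```

The result may be fractional (for example 2.5), in which case the percentile lies between two neighbouring values of the ordered array.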
Quartiles

Quartiles split the ranked data into 4 equal groups, each holding 25% of the values, separated by Q1, Q2 and Q3.

Example: Find the first quartile

Sample Data in Ordered Array: 11 12 13 16 16 17 18 21 22

(n = 9)

Q1 = 25th percentile, so find the

\[ \frac{25}{100}(9 + 1) = 2.5 \]

position, so use the value half way between the 2nd and 3rd values, so Q1 = 12.5
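
The example can be reproduced with a short Python sketch. The helper `percentile_value` is hypothetical. It assumes linear interpolation between the two neighbouring values when the position is fractional, and clamps to the first or last value when the position falls outside the array (the slides do not cover that case). Python's `statistics.quantiles` uses the same (n + 1) position rule by default and is shown for comparison.

```python
from statistics import quantiles

def percentile_value(ordered, p):
    """Value at 1-based position p / 100 * (n + 1), interpolating linearly."""
    n = len(ordered)
    position = p * (n + 1) / 100
    below = int(position)             # whole part of the position
    fraction = position - below
    if below < 1:
        return ordered[0]             # assumption: clamp below the first value
    if below >= n:
        return ordered[-1]            # assumption: clamp above the last value
    return ordered[below - 1] + fraction * (ordered[below] - ordered[below - 1])

data = [11, 12, 13, 16, 16, 17, 18, 21, 22]

q1 = percentile_value(data, 25)
q1_stdlib = quantiles(data, n=4)[0]   # first of the three quartile cut points
print(q1, q1_stdlib)
```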
QUARTILES

\[ Q_i = l + \frac{i(n/4) - cf}{f} \times h, \qquad i = 1, 2, 3 \]

l = lower limit of quartile class interval
cf = cumulative freq of class prior to quartile class
f = freq of quartile class
h = width of quartile class
n = total no of observations
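
A minimal Python sketch of the grouped-quartile formula follows. The function name and the (lower, upper, frequency) layout are assumptions made for illustration. The quartile class is taken to be the first class whose cumulative frequency reaches i(n/4). The sketch is checked on Q2 for the age-of-autos table, since Q2 must equal the median.

```python
def grouped_quartile(classes, i):
    """Q_i = l + (i * n / 4 - cf) / f * h for classes given as (lower, upper, freq)."""
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2 or 3")
    n = sum(freq for _, _, freq in classes)
    target = i * n / 4
    cf = 0                                    # cumulative freq before current class
    for lower, upper, freq in classes:
        if freq > 0 and cf + freq >= target:
            return lower + (target - cf) / freq * (upper - lower)
        cf += freq
    raise ValueError("classes must contain at least one observation")

autos = [(0, 4, 13), (4, 8, 29), (8, 12, 48), (12, 16, 22), (16, 20, 8)]

q2 = grouped_quartile(autos, 2)
print(q2)
```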
EXERCISE

• CALCULATE Q1 AND P37 FOR THE PREVIOUS EXAMPLE
