
Descriptive Statistics - 1.

SDC
Dealing with decision problems in which uncertainty plays an
important role.
Descriptive Statistics
Sampling and Sampling Distributions
Point and Interval Estimation
Hypothesis Testing
Non-parametric Test - Chi-square Test
Analysis of Variance


Time Series and Forecasting


Survey and sampling methods
Multivariate Analysis
Bayesian Statistics and Decision Analysis

• Population and sample
• Measures of Central Tendency
  Mean, Median, Mode
• Measures of Dispersion
  Variance, Standard deviation
  Percentile, Inter-quartile range
• Grouped data and histogram
• Other data representations

Population and sample
• Population: The population consists of the set of all measurements in which the investigator is interested. The population is also called the universe.
• Sample: A sample is a subset of measurements selected from the population. Sampling from the population is often done randomly, i.e. such that every possible sample of n elements has an equal chance of being selected. A sample created in this way is called a simple random sample, or just a random sample.
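A minimal sketch of random sampling, assuming a hypothetical population of 1,000 numbered measurements. Python's standard `random.sample` draws n elements without replacement, so every possible sample of n elements is equally likely. The population, the seed and the sample size are illustrative assumptions, not data from the slides.

```python
import random

# Hypothetical population: 1,000 measurements labelled 0..999 (an assumption).
population = list(range(1000))

def simple_random_sample(pop, n, seed=None):
    """Draw n distinct elements; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seeded generator makes the draw repeatable
    return rng.sample(pop, n)

sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```

Fixing the seed is only for repeatability of the illustration; in a real survey the draw would not be seeded.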
 

Example 1.1.
A drug manufacturer interested in marketing a new drug may be required
by the Food and Drug Administration (FDA) to prove that the drug does
not cause any serious side effects.
A sample of people is selected at random, and the results of testing
the drug on this sample may then be used to make a statistical
inference about the entire population of people who may use the drug
if it is introduced.

Measures of Central Tendency
Mode: The mode of a data set is the value that occurs most frequently.
Mean
Arithmetic Mean - AM
Given a set of data $x_1, x_2, \dots, x_n$, the arithmetic mean is defined as follows:
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
This kind of mean is the most frequently used.

Harmonic Mean - HM
$$\mathrm{HM} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
This kind of mean is used when dealing with velocity.
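The mean conventionally used for velocities is the harmonic mean. As a small sketch, assume a hypothetical trip in which the same distance is covered at 40 km/h and then at 60 km/h (these speeds are illustrative, not from the slides). The average speed over the whole trip is the harmonic mean of the two speeds, not their arithmetic mean.

```python
from statistics import harmonic_mean

# Hypothetical speeds over two legs of equal distance (an assumption).
speeds = [40, 60]

def harmonic_mean_manual(xs):
    """n divided by the sum of the reciprocals of the observations."""
    return len(xs) / sum(1 / x for x in xs)

avg_speed = harmonic_mean_manual(speeds)
print(avg_speed)              # close to 48 km/h, below the arithmetic mean of 50
print(harmonic_mean(speeds))  # the standard-library function gives the same value
```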

• Population Mean
$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$
• Sample Mean
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Median
The median of a set of observations is the point in the middle of the
ordered data: half of the data lie below it and half above it.

Example 1.2.
Set 1: Ordered data 7, 9, 15, 18, 20; the median is 15.
Set 2: Ordered data 15.8, 20.7, 21.1, 22.5, 33.4, 40.3;
the median is (21.1 + 22.5)/2 = 21.8.
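The two medians above can be checked with Python's standard `statistics` module. This is only a sketch: the two sets are taken from the example, while the small list passed to `multimode` is a made-up illustration of the mode.

```python
from statistics import mean, median, multimode

set1 = [7, 9, 15, 18, 20]
set2 = [15.8, 20.7, 21.1, 22.5, 33.4, 40.3]

# Odd number of points: the median is the middle value.
print(median(set1))
# Even number of points: the median is the average of the two middle values.
print(median(set2))
# Arithmetic mean of set 1.
print(mean(set1))
# multimode returns every most frequent value (hypothetical data).
print(multimode([1, 2, 2, 3, 3]))
```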

Measures of Dispersion

Variance and Standard Deviation


The variance of a set of observations is the
average squared deviation of the data points
from their mean.

Sample Variance
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$$
Note: the denominator is $(n-1)$.

Population Variance
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

The standard deviation of a set of observations is the square root of
the variance of the set.
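A short sketch of the two variance definitions, written out by hand and compared with the standard `statistics` functions. The data are Set 1 of the median example (7, 9, 15, 18, 20); treating them first as a sample and then as a whole population is an assumption made purely for illustration.

```python
from math import sqrt
from statistics import pvariance, variance

data = [7, 9, 15, 18, 20]

def sample_variance(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def population_variance(xs):
    """Sum of squared deviations from the mean, divided by N."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(sample_variance(data), variance(data))       # both about 31.7
print(population_variance(data), pvariance(data))  # both about 25.36
print(sqrt(sample_variance(data)))                 # sample standard deviation
```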

Percentiles
The Pth percentile of a group of numbers is the value below which lie
P% (P percent) of the numbers in the group. Its position in the ordered
data is given by (n+1)P/100, where n is the number of data points.
(Percentiles are used, for example, to report GRE and GMAT test scores.)
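The position rule can be sketched as a small function. The text gives only the position (n+1)P/100; taking a linear interpolation between the two neighbouring ordered values when the position is not a whole number is an assumption here, as are the function name and the sample data.

```python
def percentile(data, p):
    """P-th percentile at 1-based position (n + 1) * p / 100 in the sorted data.

    Linear interpolation between neighbours is an assumed convention.
    """
    xs = sorted(data)
    n = len(xs)
    pos = (n + 1) * p / 100
    if pos <= 1:          # position falls before the first point
        return xs[0]
    if pos >= n:          # position falls after the last point
        return xs[-1]
    k = int(pos)          # whole part: index of the lower neighbour (1-based)
    frac = pos - k        # fractional part: how far to move towards the next point
    return xs[k - 1] + frac * (xs[k] - xs[k - 1])

# Hypothetical data set with n = 9 points.
scores = [10, 20, 30, 40, 50, 60, 70, 80, 90]
print(percentile(scores, 50))  # position 5 -> 50
print(percentile(scores, 25))  # position 2.5 -> halfway between 20 and 30
```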

Quartiles
The quartiles are the percentage points that break the ordered data
set into 4 groups of equal size:
• 1st quartile Q1 is the 25th percentile.
• 2nd quartile Q2 is the 50th percentile (the median).
• 3rd quartile Q3 is the 75th percentile.

Inter-Quartile Range IQR = Q3 - Q1


Example 1.3.
Given a data set of 22 points:
88, 56, 64, 45, 52, 76, 54, 79, 38, 98, 69, 77, 71,
45, 60, 78, 90, 81, 87, 44, 80, 41. Find the 20th,
30th and 90th percentiles. Also find the IQR.
What are the mean, mode and median? What is the
variance of the set?
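One way to work this example is with Python's standard `statistics` module. The sketch below assumes that `quantiles(..., method="exclusive")` is an acceptable stand-in for the (n+1)P/100 position rule (it interpolates linearly between neighbouring ordered values) and that the 22 points are treated as a sample when computing the variance.

```python
from statistics import mean, median, multimode, quantiles, variance

data = [88, 56, 64, 45, 52, 76, 54, 79, 38, 98, 69, 77, 71,
        45, 60, 78, 90, 81, 87, 44, 80, 41]

# 99 cut points; element P - 1 is the P-th percentile.
pct = quantiles(data, n=100, method="exclusive")
p20, p30, p90 = pct[19], pct[29], pct[89]

# Quartiles and inter-quartile range.
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

print("20th, 30th, 90th percentiles:", p20, p30, p90)
print("Q1, Q3, IQR:", q1, q3, iqr)
print("mean:", mean(data))
print("median:", median(data))
print("mode(s):", multimode(data))
print("sample variance:", variance(data))
```

With these conventions the percentiles come out at 45, 53.8 and 89.4, the IQR at 30 (Q1 = 50.25, Q3 = 80.25), the median at 70 and the mode at 45.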

Grouped data and histogram
• Classes: We divide the data values into classes that have the same width and together cover all data points. Each class is represented by a single observation value.
• Frequencies: The number of observations in each class. The total of the frequencies is the number of observations. The relative frequency of each class is the ratio of its frequency to the total number of observations.
• Histogram: a chart of the frequencies (or relative frequencies) of the classes.

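• Mean and Variance of grouped data

Population:
$$\mu = \frac{\sum_{i=1}^{K} f_i m_i}{N}, \qquad \sigma^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \mu)^2}{N}$$

Sample:
$$\bar{x} = \frac{\sum_{i=1}^{K} f_i m_i}{n}, \qquad s^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \bar{x})^2}{n-1}$$

where K is the number of classes, $m_i$ is the value representing class i, $f_i$ is the frequency of class i, N is the number of observations in the population and n is the number of observations in the sample.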
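A minimal sketch of the grouped-data mean and variance, weighting each class value by its frequency. The class values `m` and frequencies `f` below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
# Hypothetical grouped data: class values m_i and frequencies f_i (assumed).
m = [10, 20, 30, 40]
f = [2, 5, 2, 1]

def grouped_mean(m, f):
    """Frequency-weighted mean of the class values."""
    n = sum(f)
    return sum(fi * mi for mi, fi in zip(m, f)) / n

def grouped_variance(m, f, sample=True):
    """Frequency-weighted squared deviations; divide by n - 1 for a sample, n otherwise."""
    n = sum(f)
    centre = grouped_mean(m, f)
    ss = sum(fi * (mi - centre) ** 2 for mi, fi in zip(m, f))
    return ss / (n - 1 if sample else n)

print(grouped_mean(m, f))                    # 220 / 10 = 22.0
print(grouped_variance(m, f, sample=False))  # 760 / 10 = 76.0
print(grouped_variance(m, f, sample=True))   # 760 / 9
```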

Example 1.4.
The number of errors per page in a textbook was counted.
The number of errors per page is given in column (mi),
while column (fi) shows the number of pages containing
that many errors. The following table and charts
show the histogram of the error distribution:

[Table and histogram for Example 1.4: number of errors per page (mi) and number of pages (fi)]

Other data representations
Index numbers
Simple index numbers
An index number is a number that measures the relative change in a
set of measurements over time.
Index number for period i = 100 × (value in period i / value in base period)
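The simple index number formula can be sketched as follows. The yearly prices and the helper name `index_numbers` are hypothetical; only the formula itself comes from the text.

```python
def index_numbers(values, base_period):
    """Index for each period = 100 * (value in period / value in base period)."""
    base = values[base_period]
    return {period: 100 * v / base for period, v in values.items()}

# Hypothetical prices by year (an assumption for illustration).
prices = {1990: 50.0, 1991: 55.0, 1992: 60.0}
idx = index_numbers(prices, base_period=1990)
print(idx)  # the base year is 100 by construction
```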

[Table and chart: price and index number by year]

Consumer Price Index - Laspeyres Index

The Laspeyres index measures the change in the cost of a set of
items, taking both the price and the quantity of each item into account.

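[Table: items with price and quantity for 1993, 1994 and 1995]

$$L_t = \frac{\sum_i p_{i,t}\, q_{i,0}}{\sum_i p_{i,0}\, q_{i,0}} \times 100$$

where $p_{i,t}$ is the price of item i in period t, and $p_{i,0}$ and $q_{i,0}$ are its price and quantity in the base period.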

• Compute the Laspeyres Index:
Select year 1993 as the base year.
 For 1993: Sum of quantity × price = 29594
 For 1994: Sum of quantity × price = 31413
 For 1995: Sum of quantity × price = 30546

Laspeyres Index:
 For 1993: 100
 For 1994: 106.15
 For 1995: 103.22
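The index values follow from dividing each year's sum by the base-year sum and multiplying by 100. The sketch below first reproduces that arithmetic from the three sums given above, then shows the usual Laspeyres form on a small basket. The basket is hypothetical, and weighting every year's prices by base-year quantities is the standard Laspeyres convention, assumed here because the item table itself is not available.

```python
# Sums of quantity x price as given in the text (1993 is the base year).
totals = {1993: 29594, 1994: 31413, 1995: 30546}
base_total = totals[1993]
index = {year: round(100 * t / base_total, 2) for year, t in totals.items()}
print(index)

def laspeyres_index(base_prices, base_quantities, prices):
    """100 * sum(p_t * q_0) / sum(p_0 * q_0), with base-period quantities as weights."""
    base_cost = sum(p0 * q0 for p0, q0 in zip(base_prices, base_quantities))
    cost = sum(pt * q0 for pt, q0 in zip(prices, base_quantities))
    return 100 * cost / base_cost

# Hypothetical two-item basket (an assumption for illustration).
print(laspeyres_index([2.0, 5.0], [10, 4], [3.0, 5.0]))
```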

Stem-and-Leaf Displays

A way of rearranging data that lets the data "speak for themselves".
Example
Given the data set: 11, 12, 12, 13, 14, 15, 15, 16,
20, 21, 21, 21, 21, 22, 25, 25, 26, 27, 28, 29, 29,
31, 32, 34, 35, 36, 38, 41, 42, 45, 47, 50, 52, 55,
60, 62

Stem-and-leaf display:
1 | 12234556
2 | 0111125567899
3 | 124568
4 | 1257
5 | 025
6 | 02
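A display like this one can be produced by splitting each value into its tens digit (the stem) and units digit (the leaf). The sketch below uses the data set from the example; the function name and the choice of the tens digit as the stem are assumptions that fit two-digit data.

```python
from collections import defaultdict

data = [11, 12, 12, 13, 14, 15, 15, 16,
        20, 21, 21, 21, 21, 22, 25, 25, 26, 27, 28, 29, 29,
        31, 32, 34, 35, 36, 38, 41, 42, 45, 47, 50, 52, 55,
        60, 62]

def stem_and_leaf(values):
    """Map each stem (tens digit) to the string of its sorted leaves (units digits)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in sorted(groups.items())}

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(stem, "|", leaves)
```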

Box-Whiskers plot
[Figure: box-and-whiskers plot]

Examples of Box-Whiskers plots
[Figure: example box-and-whiskers plots]

Box-Whisker plots (or box plots) are useful for the following purposes:
• To identify the spread of the data set.
• To identify the location of the data set, based on the median.
• To identify possible skewness of the distribution.
• To identify suspected outliers and outliers.
• To quickly compare data sets.
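The quantities a box plot displays can be computed directly. This sketch assumes the common convention of inner fences at 1.5 × IQR and outer fences at 3 × IQR beyond the quartiles: points between the fences are treated as suspected outliers and points beyond the outer fences as outliers. The fence multipliers, the function name and the data are assumptions, not values taken from the slides.

```python
from statistics import quantiles

def box_plot_summary(data):
    """Quartiles, IQR and outlier lists under the assumed 1.5 / 3 x IQR fences."""
    xs = sorted(data)
    q1, q2, q3 = quantiles(xs, n=4, method="exclusive")
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    suspected = [x for x in xs
                 if outer[0] <= x < inner[0] or inner[1] < x <= outer[1]]
    outliers = [x for x in xs if x < outer[0] or x > outer[1]]
    return {"q1": q1, "median": q2, "q3": q3, "iqr": iqr,
            "suspected": suspected, "outliers": outliers}

# Hypothetical data with one moderately large and one very large value.
summary = box_plot_summary([2, 4, 5, 5, 6, 7, 8, 9, 10, 18, 30])
print(summary)
```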
  
See the example in SPSS.