Statistics

Polytechnic University of the Philippines
College of Engineering
Department of Mechanical Engineering

TOTAL QUALITY
MANAGEMENT FOR ME
Basic Statistics

Submitted by:
Aromin, Albert S.
Marcelo, Christian John B.
BSME IV-1
Submitted to:
Prof. Rhodora Nicolas Buluran
Instructor

Introduction

A. Definition/Importance
- A collection of quantitative data pertaining to any subject or group, especially
when data are systematically gathered and collated
- The science that deals with the collection, tabulation, analysis, interpretation,
and presentation of quantitative data.
- Helps to understand the quality control charts that are used in business and
manufacturing processes.

Suppose we want to control the diameter of the piston rings that we produce. The
center line of the X-bar chart would then represent the desired standard size of the
rings (for example, the diameter in millimeters), and the center line of the R chart
would represent the acceptable range of the rings within samples and within
specifications.
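
As a rough illustration, the sketch below computes the two center lines from collected subgroups rather than from a target size. The diameters are hypothetical values invented for this example, and the variable names are not from the original report.

```python
# Hypothetical subgroups of piston-ring diameters in millimeters (illustrative values only).
subgroups = [
    [74.01, 73.99, 74.00, 74.02],
    [73.98, 74.00, 74.01, 73.99],
    [74.00, 74.02, 73.99, 74.01],
]

# Each subgroup contributes one mean (X-bar chart) and one range (R chart).
subgroup_means = [sum(g) / len(g) for g in subgroups]
subgroup_ranges = [max(g) - min(g) for g in subgroups]

# Center lines: the average of the subgroup means and the average of the subgroup ranges.
xbar_center = sum(subgroup_means) / len(subgroup_means)
r_center = sum(subgroup_ranges) / len(subgroup_ranges)

print(f"X-bar chart center line: {xbar_center:.4f} mm")
print(f"R chart center line:     {r_center:.4f} mm")
```

Control limits are left out on purpose, since they depend on tabulated constants that this report does not list.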

B. Two Phases of Statistics
A. Descriptive/Deductive
- Endeavors to describe a subject or group

B. Inductive
- Endeavors to determine, from a limited amount of data (sample), an important
conclusion about a much larger amount of data (population)

C. Collecting the data
A. Variables
- Quality characteristics that are measurable
B. Attributes
- Quality characteristics that are classified as either conforming or not
conforming to specifications
C. Continuous
- A variable that is capable of any degree of subdivision
D. Discrete
- A variable that exhibits gaps
Difference between Precision and Accuracy
Accuracy
The degree of conformity of a measure to a standard or a true value
Precision
The degree of refinement with which an operation is performed or a measurement
is stated

D. Describing the data
Frequency distribution
A summarization of how the data points occur within each subdivision of observed
values or groups of observed values.
Measures of Central Tendency
It is a numerical value that describes the central position of the data or how the data
tend to build up in the center.
Measures of Dispersion
It describes how the data are spread out or scattered on each side of the central
value.

Frequency Distribution

Ungrouped Data
Comprise a listing of the observed values
Array
It is the arrangement of raw numerical data in ascending or descending order of
magnitude
Frequency
It is the numerical value for the number of tallies.
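
A minimal sketch of an array and a frequency tally for ungrouped data. It reuses the ten scores from the range example in this report; the variable names are illustrative.

```python
from collections import Counter

# Observed values reused from the range example in this report.
observed = [4, 8, 1, 6, 6, 2, 9, 3, 6, 9]

# Array: the raw data arranged in ascending order of magnitude.
array = sorted(observed)

# Frequency: the number of tallies for each distinct observed value.
frequency = Counter(array)

for value in sorted(frequency):
    print(f"{value}: {'|' * frequency[value]} ({frequency[value]})")
```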

Grouped Data
It represents a lumping together of the observed values

Steps in creating Frequency Distribution Table for Grouped Data
1. Collect data and construct a tally sheet

2. Determine the range
The range is the difference between the highest observed value and the lowest observed value.

$R = X_h - X_l$

Where:
$R$ = range
$X_h$ = highest number
$X_l$ = lowest number
Example:

$R = 2.575 - 2.531 = 0.044$
3. Determine the cell interval
The cell interval is the distance between adjacent cell midpoints.

For the example problem, the answer is

Where:
h = no. of cells
Example:

4. Determine the cell midpoints
= midpoint for lowest cell

Example:

5. Determine the cell boundaries
Cell boundaries are the extreme or limit values of a cell, referred to as the upper
boundary and the lower boundary.

6. Post the cell frequency
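
The six steps can be sketched as follows. The measurements are hypothetical, chosen only so that the highest and lowest values match the range example (2.575 and 2.531, stored here in thousandths). The cell interval of 5 and the placement of the lowest midpoint are assumptions made for this illustration, not values from the report.

```python
# Hypothetical measurements, recorded in thousandths of a unit so the arithmetic stays exact.
values = [2531, 2534, 2540, 2542, 2545, 2547, 2551, 2553,
          2553, 2556, 2558, 2560, 2562, 2566, 2569, 2575]

# Step 2: range = highest observed value - lowest observed value.
data_range = max(values) - min(values)

# Step 3: cell interval (assumed here; an odd interval keeps the midpoints whole numbers).
interval = 5

# Step 4: midpoint of the lowest cell (assumed placement, so the lowest value falls inside it).
first_midpoint = min(values) + interval // 2

# Steps 5 and 6: boundaries sit half an interval either side of each midpoint;
# the extra half unit means no observation can land exactly on a boundary.
table = []
midpoint = first_midpoint
while midpoint - interval / 2 <= max(values):
    lower, upper = midpoint - interval / 2, midpoint + interval / 2
    count = sum(1 for v in values if lower < v < upper)
    table.append((lower, midpoint, upper, count))
    midpoint += interval

print(f"Range: {data_range}")
for lower, mid, upper, count in table:
    print(f"{lower:8.1f} to {upper:8.1f}  midpoint {mid}  frequency {count}")
```

Working in whole thousandths avoids floating-point rounding when comparing observations with the cell boundaries.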

Measures of Central tendency
- In general terms, central tendency is a statistical measure that determines a
single value that accurately describes the center of the distribution.
- Measures of central tendency are statistical measures which describe the
position of a distribution.
- The value or the figure which represents the whole series is neither the lowest
value in the series nor the highest. It lies somewhere between these two
extremes.
AVERAGE
- The statistical mean of a set of observations is the average measurement in that set of
data.
Ungrouped

Example:
1.4 1.5 2.6 3.9 2.4 1.9 3.6 2.5 2.4 3.2

It can be arranged in ascending order
1.4 1.5 1.9 2.4 2.4 2.5 2.6 3.2 3.6 3.9
Dividing the sum of the values by the total number of items gives

$\bar{X} = \frac{\sum X}{N} = \frac{25.4}{10} = 2.54$
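
The same calculation as a short sketch, using the ten observations from this example (variable names are illustrative):

```python
# The ten observations from the ungrouped example.
observations = [1.4, 1.5, 2.6, 3.9, 2.4, 1.9, 3.6, 2.5, 2.4, 3.2]

# Mean = sum of the observations / number of observations.
mean = sum(observations) / len(observations)

print(round(mean, 2))
```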

GROUPED

Example:

Median
The median of a set of observations is the value that, when the observations are
arranged in ascending or descending order, satisfies the following conditions
a. If the number of observations is odd, the median is the middle value
b. If the number of observations is even, the median is the average of the two
middle numbers
Example:
1.4 1.5 2.6 3.9 2.4 1.9 3.6 2.5 2.4 3.2

It can be arranged in ascending order
1.4 1.5 1.9 2.4 2.4 2.5 2.6 3.2 3.6 3.9
Since there are ten items, the two middle values are 2.4 and 2.5. Thus the
median is (2.4 + 2.5) / 2 = 2.45
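
A minimal sketch of the odd and even rules, applied to the same ten observations. The function name `median` is chosen for this illustration.

```python
def median(values):
    """Middle value if the count is odd, average of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

observations = [1.4, 1.5, 2.6, 3.9, 2.4, 1.9, 3.6, 2.5, 2.4, 3.2]
print(round(median(observations), 2))
```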


Mode
A. UNGROUPED
The mode of a set of observations is the specific value that occurs with the greatest
frequency
1.4 1.5 2.6 3.9 2.4 1.9 3.6 2.5 2.4 3.2
MODE = 2.4
B. GROUPED
$\text{Mode} = L + \left( \frac{\Delta_1}{\Delta_1 + \Delta_2} \right) C$
$L$ = exact lower limit of the modal class
$\Delta_1$ = numerical difference between the frequency of the modal class and the
frequency of the adjacent lower class
$\Delta_2$ = numerical difference between the frequency of the modal class and the
frequency of the adjacent higher class
$C$ = class interval
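
A minimal sketch of both cases. For grouped data it assumes the usual interpolation formula, Mode = L + d1 / (d1 + d2) * C, with d1 and d2 the two frequency differences defined in this section. The grouped frequencies are hypothetical.

```python
from collections import Counter

# Ungrouped: the value that occurs with the greatest frequency.
observations = [1.4, 1.5, 2.6, 3.9, 2.4, 1.9, 3.6, 2.5, 2.4, 3.2]
ungrouped_mode = Counter(observations).most_common(1)[0][0]

def grouped_mode(lower_limit, f_modal, f_lower, f_higher, class_interval):
    """Interpolated mode: L + d1 / (d1 + d2) * C."""
    d1 = f_modal - f_lower    # modal class vs. adjacent lower class
    d2 = f_modal - f_higher   # modal class vs. adjacent higher class
    return lower_limit + d1 / (d1 + d2) * class_interval

# Hypothetical grouped data: modal class starting at 20 with frequency 12,
# neighbours with frequencies 8 (lower) and 6 (higher), class interval 10.
print(ungrouped_mode)
print(grouped_mode(lower_limit=20, f_modal=12, f_lower=8, f_higher=6, class_interval=10))
```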

Measures of Dispersion
- Measures of dispersion are descriptive statistics that describe how similar the
values are to each other
- It describes how the data are spread out or scattered on each side of the
central value.
- The more similar the scores are to each other, the lower the measure of
dispersion will be
- The less similar the scores are to each other, the higher the measure of dispersion
will be
- In general, the more spread out a distribution is, the larger the measure of
dispersion will be
Which of the distributions of scores has the larger dispersion?

[Figure: two distributions of scores, each plotted on a vertical axis from 0 to 125 and a horizontal axis from 1 to 10]

The left-hand distribution has more dispersion because its scores are more spread out.
That is, they are less similar to each other.

Measures of dispersion:
- The range
- Variance / standard deviation
Other Measures
- Coefficient of Variation
- Skewness and Kurtosis

The Range
- The range is defined as the difference between the largest score in the set of
data and the smallest score in the set of data, Xh - Xl
- What is the range of the following data:
4 8 1 6 6 2 9 3 6 9
- The largest score (Xh) is 9; the smallest score (Xl) is 1; the range is Xh - Xl = 9 - 1
= 8
The range is used when:
- you have ordinal data or
- you are presenting your results to people with little or no knowledge of
statistics
- The range is rarely used in scientific work as it is fairly insensitive
- It depends on only two scores in the set of data, Xh and Xl
- Two very different sets of data can have the same range:
1 1 1 1 9 vs 1 3 5 7 9
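
A short sketch of the range calculation, using the data sets quoted in this section. The function name `value_range` is illustrative.

```python
def value_range(values):
    """Range = largest score - smallest score."""
    return max(values) - min(values)

scores = [4, 8, 1, 6, 6, 2, 9, 3, 6, 9]
print(value_range(scores))

# Two very different data sets with the same range.
print(value_range([1, 1, 1, 1, 9]), value_range([1, 3, 5, 7, 9]))
```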
Variance
- Variance is defined as the average of the squared deviations:

$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

What Does the Variance Formula Mean?
First, it says to subtract the mean from each of the scores
- This difference is called a deviate or a deviation score
- The deviate tells us how far a given score is from the typical, or average,
score
- Thus, the deviate is a measure of dispersion for a given score

Example, with mean $\mu = 133 / 7 = 19$:

X     d = X - μ          d²
11    11 - 19 = -8       64
13    13 - 19 = -6       36
17    17 - 19 = -2        4
18    18 - 19 = -1        1
21    21 - 19 = 2         4
24    24 - 19 = 5        25
29    29 - 19 = 10      100
N = 7

The squared deviations sum to 234, so $\sigma^2 = 234 / 7 \approx 33.43$.
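
The deviation table can be reproduced with a few lines. This is a sketch using the seven scores listed in this section and the population form of the variance (dividing by N).

```python
scores = [11, 13, 17, 18, 21, 24, 29]
n = len(scores)
mean = sum(scores) / n                      # 133 / 7 = 19

# Deviation of each score from the mean, then its square.
deviations = [x - mean for x in scores]
squared = [d ** 2 for d in deviations]

# Population variance: the mean of the squared deviations.
variance = sum(squared) / n

for x, d, d2 in zip(scores, deviations, squared):
    print(f"{x:>3} {d:>6.0f} {d2:>6.0f}")
print(f"variance = {variance:.2f}")
```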
Computational Formula Example

What Does the Variance Formula Mean?
Variance is the mean of the squared deviation scores
The larger the variance is, the more the scores deviate, on average, away from
the mean
The smaller the variance is, the less the scores deviate, on average, from the
mean

Standard Deviation
When the deviate scores are squared in variance, their unit of measure is
squared as well
E.g. if people's weights are measured in pounds, then the variance of the
weights would be expressed in pounds² (or squared pounds)
Since squared units of measure are often awkward to deal with, the square root
of variance is often used instead
The standard deviation is the square root of variance
$\text{Standard deviation} = \sqrt{\text{variance}}$
$\text{Variance} = \text{standard deviation}^2$

Computational Formula
When calculating variance, it is often easier to use a computational formula, which is
algebraically equivalent to the definitional formula:

$\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N} = \frac{\sum (X - \mu)^2}{N}$

$\sigma^2$ is the population variance, $X$ is a score, $\mu$ is the population mean, and $N$ is the
number of scores
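
As a quick check that the two forms agree, the sketch below evaluates both on the seven scores used in the variance example. Variable names are illustrative.

```python
scores = [11, 13, 17, 18, 21, 24, 29]
n = len(scores)

# Definitional formula: mean of the squared deviations from the mean.
mean = sum(scores) / n
definitional = sum((x - mean) ** 2 for x in scores) / n

# Computational formula: only the sum of scores and the sum of squared scores are needed.
sum_x = sum(scores)
sum_x2 = sum(x ** 2 for x in scores)
computational = (sum_x2 - sum_x ** 2 / n) / n

print(f"{definitional:.4f} {computational:.4f}")
```

The computational form avoids computing the mean first, which is convenient when working by hand from running totals.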

Variance of a Sample

$s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$

$s^2$ is the sample variance, $X$ is a score, $\bar{X}$ is the sample mean, and $N$ is the number of
scores
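
A minimal sketch contrasting the population and sample forms on the seven scores from the variance example, cross-checked against Python's standard `statistics` module. Treating those scores as a sample is an assumption made only for illustration.

```python
import math
import statistics

scores = [11, 13, 17, 18, 21, 24, 29]
n = len(scores)
mean = sum(scores) / n
sum_sq_dev = sum((x - mean) ** 2 for x in scores)

population_variance = sum_sq_dev / n        # divide by N
sample_variance = sum_sq_dev / (n - 1)      # divide by N - 1
sample_sd = math.sqrt(sample_variance)      # standard deviation = square root of variance

# Cross-check against the standard library.
assert math.isclose(sample_variance, statistics.variance(scores))
assert math.isclose(sample_sd, statistics.stdev(scores))
assert math.isclose(population_variance, statistics.pvariance(scores))

print(f"s^2 = {sample_variance:.2f}, s = {sample_sd:.2f}")
```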
Standard Deviation
Ungrouped sample

$s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$

Standard Deviation
Grouped sample

Standard Deviation
Grouped example

Coefficient of variation
- A measure that allows statisticians to compare the variation of two or more
different variables
- It is used to compare distributions with different means.
- The distribution with the largest coefficient of variation has the greatest
relative variation.

$CV = \frac{s}{\bar{X}} \times 100\%$

Example 1
The mean number of parking tickets issued over a 4-month period is 90, with a standard
deviation of 5. The average revenue was 5,400, with a standard deviation of 775.
Compare the variations of the two variables.
CV(tickets) = 5/90 x 100 = 5.56%
CV(revenue) = 775/5400 x 100 = 14.35%
Since CV(revenue) > CV(tickets), the recorded revenue is more variable than the number of tickets.
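
The example can be checked with a short sketch. The function name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std_dev, mean):
    """CV as a percentage: standard deviation relative to the mean."""
    return std_dev / mean * 100

cv_tickets = coefficient_of_variation(5, 90)
cv_revenue = coefficient_of_variation(775, 5400)

print(f"CV(tickets) = {cv_tickets:.2f}%")
print(f"CV(revenue) = {cv_revenue:.2f}%")
```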

Skewness and Kurtosis
Measure of Skew

If $s^3 < 0$, then the distribution has a negative skew
If $s^3 > 0$, then the distribution has a positive skew
If $s^3 = 0$, then the distribution is symmetrical
The more different $s^3$ is from 0, the greater the skew in the distribution

Kurtosis
Kurtosis measures whether the scores are spread out more or less than they
would be in a normal distribution

When the distribution is normally distributed, its kurtosis equals 3 and it is said to
be mesokurtic
When the distribution is less spread out than normal, its kurtosis is greater than 3
and it is said to be leptokurtic
When the distribution is more spread out than normal, its kurtosis is less than 3 and
it is said to be platykurtic

Measure of Kurtosis
The measure of kurtosis is given by:

$s^4 = \frac{\sum (X - \bar{X})^4 / N}{\left( \sum (X - \bar{X})^2 / N \right)^2}$

Difference between Skewness and Kurtosis

Skewness: extent and direction
Kurtosis: degree of peakedness

Collectively, the variance ($s^2$), skew ($s^3$), and kurtosis ($s^4$) describe the shape of the
distribution
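
A minimal sketch of both shape measures, assuming the common moment-based definitions: skew as the third central moment divided by the 1.5 power of the second, and kurtosis as the fourth central moment divided by the square of the second. The skew definition is an assumption, since this report does not show its formula. The two data sets are the ones quoted in the range discussion.

```python
def central_moment(values, k):
    """k-th central moment: the mean of (X - mean)^k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    # Assumed definition: third central moment over the 1.5 power of the second.
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    # Fourth central moment over the square of the second; equals 3 for a normal distribution.
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 3, 5, 7, 9]
right_tailed = [1, 1, 1, 1, 9]

print(f"symmetric:    skew {skewness(symmetric):.2f}, kurtosis {kurtosis(symmetric):.2f}")
print(f"right-tailed: skew {skewness(right_tailed):.2f}, kurtosis {kurtosis(right_tailed):.2f}")
```

The two sets share the same range but differ in skew and kurtosis, which is why these measures add information that the range alone misses.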
