
MISS FARAH DAYANA

LEARNING OUTCOMES
At the end of this lecture, students should be able to:
- Understand the process of data collection.
- Understand the various types of data and variables.
- Demonstrate ways to organize data using frequency distributions.

Introduction
Statistics is a field of study concerned with:
- Collection, organization, summarization, and analysis of data.
- Drawing inferences about a body of data when only a part of the data is observed.

Biostatistics is the application of statistics to a wide range of topics in biology.

Data:
The raw material of statistics is data.
We may define data as figures. Figures result from the process of counting or taking a measurement.

For example:
- When a hospital administrator counts the number of patients (counting)
- When a nurse weighs a patient (measurement)

Sources of Data
1. Routinely kept records
For example: hospital medical records contain immense amounts of information on patients.
2. External records
The data needed to answer a question may already exist in the form of published reports, research literature, etc.
For example: the number of deaths from dengue fever in 2012.

3. Surveys
If the data needed are answers to certain questions, the source may be a survey.
4. Experiments
Frequently the data needed to answer a question are available only as the result of an experiment.
For example: a researcher wishes to know how effective a drug is as a treatment for cancer.

Collecting data
Data can be collected using a questionnaire or a data collection sheet.

A questionnaire is used when you wish to ask a sample of people a series of structured questions relevant to your line of enquiry.

A data collection sheet or observation sheet is used when recording results involving counting, measuring or observing. It can also be used to collect the answers to a few simple questions.

Data can also be collected from secondary sources such as the Internet,
newspapers or reference books.

Designing a questionnaire
A better question would be:
How much of the Olympics coverage did you watch?
Tick one box only.
None
Less than 1 hour a day
Between 1 and 2 hours a day
More than 2 hours a day

Every eventuality has been accounted for, and the person answering the question cannot give any other response.

Designing a questionnaire
A scale can be used when asking for an opinion.
For example,

How would you rate the leisure facilities available in your local area? Tick one
box only.
Excellent
Good
Satisfactory
Poor
Unsatisfactory

Designing a data collection sheet


A data collection sheet can be used to record data that comes from counting,
observing or measuring.
It can also be used to record responses to specific questions.
For example, to investigate a claim that the amount of TV watched has an impact
on weight we can use the following:

age    gender    height (cm)    weight (kg)    hours of TV watched per week

Using a tally chart


When collecting data that involves counting something we often use a tally chart.

For example, this tally chart can be used to record people's favourite snacks.

favourite snack    tally    frequency
crisps                      13
fruit
nuts
sweets

The tally marks are recorded as responses are collected, and the frequencies are then filled in.

Variable
When collecting or gathering data, we collect data from individual cases on particular variables.
A variable is a unit of data collection whose value can vary.
It is a characteristic that takes on different values.
For example:
- Heart rate
- The heights of adult males
- The weights of preschool children
- The ages of patients

Types of Variable

Quantitative
It can be measured.
For example:
- Heights
- Weights
- Ages

Qualitative
Many characteristics are not capable of being measured. Some of them can be ordered or ranked.
For example:
- Race
- Social class

Categorical data
Categorical data is data that is non-numerical.

For example:
- favourite football team,
- eye colour,
- birth place.

Sometimes categorical data can contain numbers.

For example:
- favourite number,
- last digit in your telephone number,
- most used bus route.

Discrete and continuous data


Numerical data can be discrete or continuous.
Discrete data can only take certain values.
For example:
- shoe sizes,
- the number of children in a class,
- the number of sweets in a packet.

Continuous data comes from measuring and can take any value within a given range.

For example:
- the weight of a banana,
- the time it takes for pupils to get to school,
- the height of 13-year-olds.

There are four types of data, or levels of measurement:
1. Nominal
2. Ordinal
3. Interval
4. Ratio

Nominal data
Nominal (or categorical) data is data that comprises categories that cannot be rank ordered; each category is just different.
The categories cannot be placed in any order, and no judgement can be made about the relative size or distance from one category to another.

Nominal data
Examples:
What is your gender? (please tick)
Male
Female

Did you enjoy the film? (please tick)
Yes
No

Ordinal data
Ordinal data is data that comprises categories that can be rank ordered.
As with nominal data, the distance between categories cannot be calculated, but the categories can be ranked above or below each other.

Ordinal data
Example:
How satisfied are you with the level of service you have received? (please tick)

Very satisfied
Somewhat satisfied
Neutral
Somewhat dissatisfied
Very dissatisfied

Interval and ratio data


Both interval and ratio data are examples of scale data.
Scale data:
- the data is in numeric format (50, 100, 150)
- the data can be measured on a continuous scale
- the distance between each value can be observed and, as a result, measured
- the data can be placed in rank order.

Interval data
Interval data is measured on a continuous scale and has no true zero point.
Examples:
- Time moves along a continuous measure of seconds, minutes and so on, and is without a zero point of time.
- Temperature (in degrees Celsius or Fahrenheit) moves along a continuous measure of degrees and is without a true zero.

Ratio data
Ratio data is measured on a continuous scale and does have a true zero point.
Examples:
- Age
- Weight
- Height

Population
The entire pool from which a statistical sample is drawn.

For example: the weights of all the children enrolled in a certain elementary school.

Sample
A data sample is a set of data collected and/or selected from a statistical population.

For example: the weights of the children in Classes A and B of that elementary school.
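The relationship between a population and a sample can be sketched in a few lines of Python. The weights below are simulated purely for illustration (they are not real measurements), and the sketch assumes a simple random sample, which is only one of several ways a sample might be selected.

```python
import random

# Hypothetical population: weights (kg) of all 200 children in a school.
# The values are simulated for illustration only.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [round(rng.uniform(18.0, 45.0), 1) for _ in range(200)]

# A simple random sample of 30 children, drawn without replacement.
sample = rng.sample(population, k=30)

population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print(f"Population: N = {len(population)}, mean = {population_mean:.1f} kg")
print(f"Sample:     n = {len(sample)}, mean = {sample_mean:.1f} kg")
```

The sample mean will usually be close to, but not exactly equal to, the population mean, which is why inferences drawn from a sample carry some uncertainty.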

Frequency Distributions
After collecting data, we need to organize and simplify the data so that it is possible to get a general overview of the results.
One method for simplifying and organizing data is to construct a frequency distribution.

Frequency Distributions (cont.)


A frequency distribution is an organized tabulation showing exactly how many individuals are located in each category on the scale of measurement.

Frequency Distribution Tables


A frequency distribution table consists of at least two columns: one listing categories on the scale of measurement (X) and another for frequency (f).
In the X column, values are listed from the highest to lowest, without skipping any.
The sum of the frequencies should equal N.

A third column can be used for the proportion (p), or relative frequency, for each category: p = f/N. The sum of the p column should equal 1.00.

Frequency Table
A research study has been conducted examining the number of children in the families living in a community. The following data have been collected based on a random sample of n = 30 families from the community.

2, 2, 5, 3, 0, 1, 3, 2, 3, 4, 1, 3, 4, 5, 7, 3, 2, 4,
1, 0, 5, 8, 6, 5, 4, 2, 4, 4, 7, 6
Organize this data in a Frequency Table!

X = No. of children    Count (frequency)    Relative freq.
0                      2                    2/30 = 0.067
1                      3                    3/30 = 0.100
2                      5                    5/30 = 0.167
3                      5                    5/30 = 0.167
4                      6                    6/30 = 0.200
5                      4                    4/30 = 0.133
6                      2                    2/30 = 0.067
7                      2                    2/30 = 0.067
8                      1                    1/30 = 0.033
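The frequency table for the 30 families can be reproduced with a short Python sketch. It uses `collections.Counter` from the standard library; the variable names are illustrative, and the rows are listed from lowest to highest X to match the table for this example.

```python
from collections import Counter

# Number of children in each of the n = 30 sampled families (from the example).
children = [2, 2, 5, 3, 0, 1, 3, 2, 3, 4, 1, 3, 4, 5, 7, 3, 2, 4,
            1, 0, 5, 8, 6, 5, 4, 2, 4, 4, 7, 6]

n = len(children)
counts = Counter(children)

# One row per value of X (none skipped), with frequency f
# and relative frequency p = f / n.
table = [(x, counts[x], counts[x] / n)
         for x in range(min(children), max(children) + 1)]

print(f"{'X':>2} {'f':>3} {'p':>6}")
for x, f, p in table:
    print(f"{x:>2} {f:>3} {p:>6.3f}")

# Checks: the f column should sum to n and the p column to 1.00.
print("Sum of f =", sum(f for _, f, _ in table))
print("Sum of p =", round(sum(p for _, _, p in table), 2))
```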

Grouped Frequency
Distribution
Sometimes, however, a set of scores covers a wide range of values. In these situations, a list of all the X values would be quite long, too long to be a simple presentation of the data.
To remedy this situation, a grouped frequency distribution table is used.

Grouped Frequency Distribution (cont.)


In a grouped table, the X column lists groups of
scores, called class intervals, rather than
individual values.
These intervals all have the same width, usually a
simple number such as 2, 5, 10, and so on.
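As a rough sketch of how equal-width class intervals might be generated, the Python function below builds the intervals for integer scores. The function name, the example scores (lowest 53, highest 94) and the choice to start the first interval at a multiple of the width are all assumptions made for illustration, not rules stated in the lecture.

```python
def class_intervals(lowest, highest, width):
    """Return (lower, upper) class limits of equal width covering lowest..highest.

    Assumes integer scores; each interval holds `width` consecutive values.
    """
    # Assumption: start the first interval at a multiple of the width
    # that is at or below the lowest score.
    start = (lowest // width) * width
    intervals = []
    while start <= highest:
        intervals.append((start, start + width - 1))
        start += width
    return intervals


# Hypothetical scores ranging from 53 to 94, grouped with a width of 5.
intervals = class_intervals(lowest=53, highest=94, width=5)
for lower, upper in intervals:
    print(f"{lower}-{upper}")
print("Number of intervals:", len(intervals))
```

With these hypothetical values the width of 5 gives nine intervals, which falls inside the commonly recommended range of 5 to 10 intervals.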


Grouping data
Tips for grouping lots of data
Choose interval widths that reduce your data to 5 to 10 intervals.
[Figure: scale marked 10, 15, 20, 25, 30, 35]

Chapter 2

Grouping data
Tips for grouping data
Choose meaningful intervals.

Which is easier to understand at a glance?
10, 15, 20, 25, 30, 35
or
10, 13, 16, 19, 22

Grouping data
Tips for grouping data
Interval widths must be the same.
10, 15, 20, 25, 30, 35
NOT
10, 20, 22, 30, 33, 35

Grouping data
Tips for grouping data
Intervals cannot overlap.
5-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40
NOT
5-10, 10-15, 14-20, 20-26, 25-30, 30-35, 35

Grouping data
An example


Cumulative Frequency Distribution


A cumulative frequency distribution shows how many cases (data points) have been accounted for out of the total number of cases (data points).

Cumulative Frequency Distribution
It shows how many data points have been accounted for as each group is displayed.

Cumulative Frequency Distribution
Cumulative frequencies can also be illustrated using percentages.

Cumulative Relative Frequency

The sum of the relative frequencies for all values at or below the given value, expressed as a proportion.

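Math Anxiety Scores    Freq    Relative Freq    Cumulative Freq    Cumulative Relative Freq
1                      1       0.05             1                  0.05
2                      2       0.09             3                  0.14
3                      3       0.14             6                  0.28
4                      4       0.18             10                 0.46
5                      5       0.23             15                 0.69
6                      0       0.00             15                 0.69
7                      2       0.09             17                 0.78
8                      3       0.14             20                 0.92
9                      1       0.05             21                 0.97
10                     1       0.05             22                 1.02

Note: the final cumulative relative frequency shows 1.02 rather than 1.00 because the rounded relative frequencies were summed.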

Laws Covering Sales of Firearms: Increase Restrictions (2000)?

              Men (N=493)    Women (N=538)
More          256            387
Less          39             11
Same          193            129
No opinion    5              11

Men and Firearm Restrictions: Frequency Distribution (N=493)

              F      CF     RF     CRF
More          256    256    .52    .52
Less          39     295    .08    .60
Same          193    488    .39    .99
No opinion    5      493    .01    1.00
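The four columns for the men's responses (F, CF, RF, CRF) can be computed with a short Python sketch. It uses `itertools.accumulate` for the running totals; the variable names are illustrative. The cumulative relative frequency is computed here as CF / N, which avoids the small rounding drift that appears when already-rounded relative frequencies are added together.

```python
from itertools import accumulate

# Men's responses on firearm restrictions (N = 493), from the table.
categories = ["More", "Less", "Same", "No opinion"]
freq = [256, 39, 193, 5]

n = sum(freq)                               # total number of cases
cum_freq = list(accumulate(freq))           # CF: running total of F
rel_freq = [f / n for f in freq]            # RF = F / N
cum_rel_freq = [cf / n for cf in cum_freq]  # CRF = CF / N

print(f"{'':<11}{'F':>5}{'CF':>6}{'RF':>6}{'CRF':>6}")
for name, f, cf, rf, crf in zip(categories, freq, cum_freq, rel_freq, cum_rel_freq):
    print(f"{name:<11}{f:>5}{cf:>6}{rf:>6.2f}{crf:>6.2f}")
```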

Example

These data represent the record high temperatures for each of the 50 states. Construct a grouped frequency distribution for the data using 7 classes.

112 100 127 120 134 118 105 110 109 112
110 118 117 116 118 122 114 114 105 109
107 112 114 115 118 117 118 122 106 110
116 108 110 121 113 120 119 111 104 111
120 113 120 117 105 110 118 112 114 114

Class limits    Class boundaries    Frequency    Relative frequency    Cumulative frequency
100-104         99.5-104.5          2            0.04                  2
105-109         104.5-109.5         8            0.16                  10
110-114         109.5-114.5         18           0.36                  28
115-119         114.5-119.5         13           0.26                  41
120-124         119.5-124.5         7            0.14                  48
125-129         124.5-129.5         1            0.02                  49
130-134         129.5-134.5         1            0.02                  50
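The grouped frequency distribution for the 50 temperatures can be checked with a Python sketch. The class width is taken as the range divided by the number of classes, rounded up, which is one common rule of thumb and an assumption here rather than a step stated in the example. It happens to cover the maximum for this data set, but in general the last class should be checked against the highest value.

```python
import math

# Record high temperatures for the 50 states (from the example).
temps = [
    112, 100, 127, 120, 134, 118, 105, 110, 109, 112,
    110, 118, 117, 116, 118, 122, 114, 114, 105, 109,
    107, 112, 114, 115, 118, 117, 118, 122, 106, 110,
    116, 108, 110, 121, 113, 120, 119, 111, 104, 111,
    120, 113, 120, 117, 105, 110, 118, 112, 114, 114,
]

num_classes = 7
lowest, highest = min(temps), max(temps)
# Assumed rule: width = range / number of classes, rounded up (34 / 7 -> 5).
width = math.ceil((highest - lowest) / num_classes)

rows = []
cumulative = 0
for k in range(num_classes):
    lower = lowest + k * width
    upper = lower + width - 1
    f = sum(1 for t in temps if lower <= t <= upper)
    cumulative += f
    # Class boundaries sit half a unit outside the class limits.
    rows.append((f"{lower}-{upper}", f"{lower - 0.5}-{upper + 0.5}",
                 f, f / len(temps), cumulative))

for limits, boundaries, f, rel, cum in rows:
    print(f"{limits:<9} {boundaries:<12} {f:>3} {rel:>5.2f} {cum:>3}")
```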

THANK YOU!
