Levels of measurement
In statistics and quantitative research methodology, various attempts have been made to classify
variables (or types of data) and thereby develop a taxonomy of levels of measurement or scales
of measure. The various levels of measurement are as shown below:
Nominal Scale/Data
This scale categorizes and differentiates items based only on their names or the qualitative
categories they belong to. The items cannot be ranked. E.g. names of towns in Ghana, days
of the week, religion. Variables measured on this scale are called categorical variables.
Ordinal Scale/Data
The data on this scale can be ranked, but the intervals between ranks are not necessarily equal.
E.g. positions in a beauty contest; level of education.
Interval Scale/Data
The items can be ranked and intervals are the same, but the position of zero is arbitrary.
Examples include temperature with the Celsius scale, which has an arbitrarily-defined zero point
(the freezing point of a particular substance under particular conditions), date when measured
from an arbitrary epoch (such as AD) and direction measured in degrees from true or magnetic
north.
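The practical consequence of an arbitrary zero can be sketched in a few lines: differences between interval-scale values are meaningful, but ratios are not. The sketch below compares Celsius with Kelvin, a temperature scale that does have a true zero. The Kelvin conversion and the two temperatures are illustrative assumptions, not part of the notes above.

```python
# Interval vs. ratio scale: differences are meaningful on both,
# but ratios are only meaningful when zero is a true zero.

def celsius_to_kelvin(c):
    """Convert Celsius (interval scale) to Kelvin (ratio scale)."""
    return c + 273.15

# Illustrative temperatures chosen for this sketch only.
warm_c, cool_c = 20.0, 10.0

# The difference is the same on both scales (10 degrees).
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# The ratio is not: 2.0 in Celsius, but only about 1.035 in Kelvin,
# so 20 degrees Celsius is not "twice as hot" as 10 degrees Celsius.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(diff_c, round(diff_k, 2))
print(ratio_c, round(ratio_k, 3))
```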
Ratio Scale/Data
We can rank items on this scale and the intervals are the same. In addition, there is a true
zero. E.g. income, length, number of years of schooling, age.
INFERENTIAL STATISTICS: This involves applying statistical techniques (e.g. chi-square tests) to sample data in order to draw conclusions about population parameters.
DEPENDENT AND INDEPENDENT VARIABLES: Dependent variables are those that are
only measured or recorded, whereas independent variables are those that are manipulated
to influence the outcome of the dependent variables. For instance, if we want to measure the effect
of years of schooling on salaries, then salary is the dependent variable while years of
schooling is the independent variable.
Coefficient of determination
This measures the proportion of the variability in the dependent variable that is explained by
variability in the independent variable.
Coefficient of determination = $r^2 \times 100\%$
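As a minimal sketch of the formula above, the function below squares a correlation coefficient and expresses it as a percentage. The function name and the value r = 0.8 are hypothetical, used only for illustration.

```python
def coefficient_of_determination(r):
    """Return r squared, expressed as a percentage."""
    return r ** 2 * 100

# r = 0.8 is a hypothetical correlation used only for illustration.
# The sign of r does not matter, because r is squared.
print(round(coefficient_of_determination(0.8), 1))
print(round(coefficient_of_determination(-0.8), 1))
```

Both calls give 64.0, i.e. 64% of the variability is accounted for, whether the correlation is positive or negative.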
Regression Analysis
Regression is a technique used to analyse the relationship between two or more variables and how
one variable affects another. It is used to establish an equation linking the variables. There
are various types of regression.
1. Simple Linear Regression: This examines the relationship between two variables (one
dependent and one independent variable), usually measured on the ratio scale. E.g. age
and weight.
2. Multiple Linear Regression: This measures the relationship between one dependent
variable and several independent variables. For instance, one can examine the
relationship between output of maize (dependent variable) and several independent
variables, including soil quality, amount of fertilizer applied and amount of rainfall.
3. Logistic Regression: This is used when the dependent variable is categorical (typically
binary); the independent variables may be categorical or continuous. E.g. we
can analyse the relationship between modern contraceptive use (dependent or outcome
variable) and variables such as location, marital status and religion.
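To make the contrast with linear regression concrete: logistic regression does not predict the outcome directly but models the probability of a binary outcome by passing a linear combination of the predictors through the logistic function. The sketch below shows only that mapping. The intercept, the coefficient and the "urban" indicator are hypothetical values chosen for illustration, not estimates from any data.

```python
import math

def logistic(z):
    """Map a linear predictor z to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))

# Hypothetical intercept and coefficient, chosen only for illustration.
intercept, coef_urban = -1.0, 0.8

# Predicted probability of the outcome (e.g. contraceptive use)
# for a rural respondent (indicator = 0) and an urban one (indicator = 1).
p_rural = logistic(intercept)
p_urban = logistic(intercept + coef_urban)

print(round(p_rural, 3), round(p_urban, 3))
```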
Trial Question
A. A social scientist is interested in establishing the degree of the relationship between
number of districts and number of hospitals in eight randomly selected administrative
regions in the Republic of Nsutapong. The table below summarizes the data he obtained
from the field.
Region    Number of Districts    Number of Hospitals
A         2                      3
B         4                      3
C         5                      4
D         5                      5
E         6                      7
F         7                      8
G         9                      9
H         10                     11
(a) Calculate the Pearson Product-Moment Correlation Coefficient (r) between number of
districts and number of hospitals and interpret your answer.
(b) Compute the coefficient of determination and interpret your answer.
(c) Fit a linear regression model for estimating the number of hospitals (y) from a given
number of districts (x).
(d) Using your model or otherwise find the number of districts in a region with 15 hospitals.
Solution
(a) Let x represent the number of districts while y represents the number of hospitals. To
calculate the correlation we construct the table below:
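x       y       xy      x²      y²
2       3       6       4       9
4       3       12      16      9
5       4       20      25      16
5       5       25      25      25
6       7       42      36      49
7       8       56      49      64
9       9       81      81      81
10      11      110     100     121
Σx = 48   Σy = 50   Σxy = 352   Σx² = 336   Σy² = 374

Substituting the totals, with n = 8:

$$
\begin{aligned}
r &= \frac{8(352) - (48 \times 50)}{\sqrt{[8(336) - (48)^2]\,[8(374) - (50)^2]}} \\
  &= \frac{2816 - 2400}{\sqrt{(2688 - 2304)(2992 - 2500)}} \\
  &= \frac{416}{\sqrt{(384)(492)}} \\
  &= \frac{416}{434.7} \\
  &\approx 0.96
\end{aligned}
$$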
Since r is 0.96, there is a strong positive correlation between the number of districts and the
number of hospitals. This means that regions with more districts tend to have more
hospitals.
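The hand calculation for part (a) can be checked with a short script, which also gives the coefficient of determination asked for in part (b). This is a sketch using only the eight (x, y) pairs from the question. The function name `pearson_r` is an illustrative choice.

```python
import math

# Data from the trial question: districts (x) and hospitals (y), regions A to H.
x = [2, 4, 5, 5, 6, 7, 9, 10]
y = [3, 3, 4, 5, 7, 8, 9, 11]

def pearson_r(xs, ys):
    """Pearson product-moment correlation via the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_x2 = sum(a * a for a in xs)
    sum_y2 = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(x, y)
r_squared_percent = r ** 2 * 100  # coefficient of determination, part (b)

print(round(r, 2))
print(round(r_squared_percent, 1))
```

With unrounded r, the coefficient of determination comes to about 91.6%; squaring the rounded value 0.96 gives about 92.2% instead. Either way, roughly 92% of the variation in the number of hospitals is accounted for by the number of districts.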
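(c) Regression Analysis
The equation of a simple linear regression line is y = a + bx, where

$$
b = \frac{\sum xy - n\,\bar{x}\,\bar{y}}{\sum x^2 - n\,\bar{x}^2}, \qquad a = \bar{y} - b\,\bar{x}
$$

x mean = 48/8 = 6
y mean = 50/8 = 6.25
By substitution,

$$
\begin{aligned}
b &= \frac{352 - 8(6 \times 6.25)}{336 - 8(6)^2} \\
  &= \frac{352 - 8(37.5)}{336 - 8(36)} \\
  &= \frac{352 - 300}{336 - 288} \\
  &= \frac{52}{48} \approx 1.08
\end{aligned}
$$

Given b as 1.08,
a = 6.25 - (1.08 × 6)
a = 6.25 - 6.48
a = -0.23
Since the equation of a simple linear regression line is given as y = a + bx,
substituting the derived values into the equation gives
y = -0.23 + 1.08x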
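The slope and intercept can be reproduced with a few lines of code. The sketch below applies the least-squares formulas for b and a to the data from the question, without intermediate rounding. The function name `fit_line` is an illustrative choice.

```python
# Data from the trial question: districts (x) and hospitals (y), regions A to H.
x = [2, 4, 5, 5, 6, 7, 9, 10]
y = [3, 3, 4, 5, 7, 8, 9, 11]

def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum(p * q for p, q in zip(xs, ys))
    sum_x2 = sum(p * p for p in xs)
    b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
    a = mean_y - b * mean_x
    return a, b

a, b = fit_line(x, y)
print(round(a, 2), round(b, 2))
```

Without intermediate rounding, b is 52/48 (about 1.083) and a is -0.25. The hand calculation arrives at a = -0.23 because it rounds b to 1.08 before computing a.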
(d) Using the model, find the number of districts in a region with 15 hospitals.
We substitute y = 15 into the equation:
15 = -0.23 + 1.08x
15.23 = 1.08x
x = 15.23/1.08
x ≈ 14.1, so the region has about 14 districts.
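The same rearrangement can be written as a small helper that inverts y = a + bx to solve for x. This is a sketch with a hypothetical function name. It is run once with the rounded coefficients used in the working above and once with the unrounded least-squares values for the same data (a = -0.25, b = 52/48).

```python
def districts_for_hospitals(hospitals, a, b):
    """Invert y = a + b*x to solve for x given y."""
    return (hospitals - a) / b

# Rounded coefficients from the hand calculation.
x_rounded = districts_for_hospitals(15, a=-0.23, b=1.08)

# Unrounded least-squares coefficients for the same data.
x_exact = districts_for_hospitals(15, a=-0.25, b=52 / 48)

print(round(x_rounded, 1), round(x_exact, 1))
```

Both versions give roughly 14.1, so the rounding does not change the answer of about 14 districts. Note that 15 hospitals lies outside the observed range (the largest value in the data is 11), so the estimate is an extrapolation and should be read with some caution.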