
BIAS AND CONFOUNDING

Nigel Paneth
HYPOTHESIS FORMULATION AND
ERRORS IN RESEARCH

All analytic studies must begin with
a clearly formulated hypothesis. The
hypothesis must be quantitative and
specific. It must predict a
relationship of a specific size.
For example:
Babies who are breast-fed have less illness
than babies who are bottle-fed.

Which illnesses? How is feeding type
defined? How large a difference in risk?
A better example:
Babies who are exclusively breast-fed for
three months or more will have a reduction in
the incidence of hospital admissions for
gastroenteritis of at least 30% over the first
year of life.
Only specific prediction allows one to
draw legitimate conclusions from a
study which tests a hypothesis. But
even with the best formulated
hypothesis, two types of errors can
occur.

Type 1 - observing a difference when in
truth there is none.

Type 2 - failing to observe a difference
when there is one.
These errors are generally produced
by one or more of the following:

RANDOM ERROR
RANDOM MISCLASSIFICATION
BIAS
CONFOUNDING
RANDOM ERROR

Deviation of results and inferences
from the truth, occurring only as a
result of the operation of chance.
Can produce type 1 or type 2 errors.
RANDOM (OR NON-DIFFERENTIAL)
MISCLASSIFICATION

Random error applied to the
measurement of an exposure or
outcome. Errors in classification can
only produce type 2 errors, except if
applied to a confounder or to an
exposure gradient.
BIAS

Systematic, non-random deviation of
results and inferences from the truth, or
processes leading to such deviation. Any
trend in the collection, analysis,
interpretation, publication or review of
data that can lead to conclusions which
are systematically different from the truth.
(Dictionary of Epidemiology, 3rd ed.)
MORE ON BIAS
Note that in bias, the focus is on an
artifact of some part of the research
process (assembling subjects,
collecting data, analyzing data) that
produces a spurious result. Bias can
produce either a type 1 or a type 2
error, but we usually focus on type 1
errors due to bias.
MORE ON BIAS

Bias can be either conscious or
unconscious. In epidemiology, the
word bias does not imply, as in
common usage, prejudice or
deliberate deviation from the truth.
CONFOUNDING
A problem resulting from the fact that one
feature of study subjects has not been
separated from a second feature, and has thus
been confounded with it, producing a spurious
result. The spuriousness arises from the
effect of the first feature being mistakenly
attributed to the second feature. Confounding
can produce either a type 1 or a type 2 error,
but we usually focus on type 1 errors.
THE DIFFERENCE BETWEEN BIAS
AND CONFOUNDING

Bias creates an association that is
not true, but confounding describes
an association that is true, but
potentially misleading.
EXAMPLES OF RANDOM ERROR, BIAS,
MISCLASSIFICATION AND CONFOUNDING
IN THE SAME STUDY:

STUDY: In a cohort study, babies of
women who bottle feed and women who
breast feed are compared, and it is found
that the incidence of gastroenteritis, as
recorded in medical records, is lower in
the babies who are breast-fed.
EXAMPLE OF RANDOM ERROR

By chance, there are more episodes of
gastroenteritis in the bottle-fed group in the
study sample, producing a type 1 error
(when in truth breast feeding is not
protective against gastroenteritis).
Or, also by chance, no difference in risk
is found, producing a type 2 error (when
in truth breast feeding is protective against
gastroenteritis).
EXAMPLE OF RANDOM
MISCLASSIFICATION
Lack of good information on feeding
history results in some breast-
feeding mothers being randomly
classified as bottle-feeding, and vice-
versa. If this happens, the study
finding underestimates the true RR,
whichever feeding modality is
associated with higher disease
incidence, producing a type 2 error.
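The attenuation described above can be checked with simple expected-value arithmetic. The sketch below is illustrative only: the group sizes, the risks (20% in bottle-fed, 10% in breast-fed babies) and the 20% mislabeling rate are assumed numbers, not data from any study, and `observed_risk_ratio` is a hypothetical helper name.

```python
def observed_risk_ratio(n_bottle, n_breast, risk_bottle, risk_breast, swap_rate):
    """Expected RR (bottle vs. breast) when a fraction `swap_rate` of each
    group is recorded under the wrong feeding type, regardless of outcome."""
    # Babies who keep their correct label, and babies who are mislabeled.
    bottle_kept, bottle_swapped = n_bottle * (1 - swap_rate), n_bottle * swap_rate
    breast_kept, breast_swapped = n_breast * (1 - swap_rate), n_breast * swap_rate

    # Group sizes and expected case counts as they appear in the recorded data.
    labeled_bottle_n = bottle_kept + breast_swapped
    labeled_bottle_cases = bottle_kept * risk_bottle + breast_swapped * risk_breast
    labeled_breast_n = breast_kept + bottle_swapped
    labeled_breast_cases = breast_kept * risk_breast + bottle_swapped * risk_bottle

    return (labeled_bottle_cases / labeled_bottle_n) / (labeled_breast_cases / labeled_breast_n)

# Assumed illustrative inputs: true RR is 0.20 / 0.10 = 2.0.
true_rr = observed_risk_ratio(1000, 1000, 0.20, 0.10, swap_rate=0.0)
blurred_rr = observed_risk_ratio(1000, 1000, 0.20, 0.10, swap_rate=0.2)
print(true_rr, blurred_rr)
```

With these assumed inputs the recorded RR falls from about 2.0 to about 1.5: the mislabeling pulls the estimate toward 1 (no association) without reversing its direction.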
EXAMPLE OF BIAS

The medical records of bottle-fed babies
only are less complete (perhaps bottle-
fed babies go to the doctor less) than
those of breast-fed babies, and thus
record fewer episodes of gastroenteritis
in them only.
This is called bias because the
observation itself is in error.
EXAMPLE OF CONFOUNDING
The mothers of breast-fed babies are of
higher social class, and the babies thus have
better hygiene, less crowding and perhaps
other factors that protect against
gastroenteritis. Crowding and hygiene are
truly protective against gastroenteritis, but
we mistakenly attribute their effects to breast
feeding. This is called confounding because
the observation is correct, but its explanation
is wrong.
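A stratified tabulation makes this concrete. In the sketch below all counts are hypothetical and chosen so that feeding type has no effect at all within either crowding stratum, while bottle-fed babies are more often in crowded homes. Crowding stands in for the social-class factors named above.

```python
# Hypothetical counts per stratum: (number of babies, gastroenteritis cases).
counts = {
    "crowded":   {"bottle": (800, 240), "breast": (200, 60)},
    "uncrowded": {"bottle": (200, 20),  "breast": (800, 80)},
}

def risk_ratio(bottle, breast):
    """Risk in bottle-fed babies divided by risk in breast-fed babies."""
    (n_bottle, cases_bottle), (n_breast, cases_breast) = bottle, breast
    return (cases_bottle / n_bottle) / (cases_breast / n_breast)

# RR within each crowding stratum (confounder held fixed).
stratum_rr = {name: risk_ratio(s["bottle"], s["breast"]) for name, s in counts.items()}

def pooled(feeding):
    """Collapse the strata, ignoring crowding, for one feeding type."""
    n = sum(s[feeding][0] for s in counts.values())
    cases = sum(s[feeding][1] for s in counts.values())
    return n, cases

# Crude RR, as seen when crowding is ignored.
crude_rr = risk_ratio(pooled("bottle"), pooled("breast"))
print(stratum_rr, crude_rr)
```

In this made-up table the RR is 1 inside each stratum, yet the crude RR is well above 1. The crude observation is arithmetically correct; attributing it to feeding type is the error.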
PROTECTION AGAINST RANDOM
ERROR AND RANDOM
MISCLASSIFICATION
Random error can work to falsely produce
an association (type 1 error) or falsely not
produce an association (type 2 error).
We protect ourselves against random
misclassification producing a type 2 error
by choosing the most precise and accurate
measures of exposure and outcome.
PROTECTION AGAINST TYPE
1 ERRORS
We protect our study against random
type 1 errors by establishing that the
result must be unlikely to have occurred
by chance (e.g. p < .05). P-values
are established entirely to protect
against type 1 errors due to chance, and
do not guarantee protection against
type 1 errors due to bias or confounding.
This is the reason we say statistics
demonstrate association but not
causation.
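What the p < .05 threshold buys can be illustrated by simulation. The sketch below is a rough illustration under assumed settings (equal true risk of 15% in both groups, 200 babies per group, 2000 repeated studies, a two-proportion z-test); the function names are hypothetical. Because the two groups truly do not differ, every "significant" result is a type 1 error due to chance.

```python
import math
import random
from statistics import NormalDist

def two_sided_p(cases_a, cases_b, n):
    """Two-proportion z-test p-value for two groups of equal size n."""
    pooled = (cases_a + cases_b) / (2 * n)
    se = math.sqrt(pooled * (1 - pooled) * 2 / n)
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (cases_a - cases_b) / n / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(true_risk=0.15, n=200, studies=2000, alpha=0.05, seed=1):
    """Share of simulated null studies that reach p < alpha by chance."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        # Both groups are drawn with the SAME true risk.
        cases_a = sum(rng.random() < true_risk for _ in range(n))
        cases_b = sum(rng.random() < true_risk for _ in range(n))
        if two_sided_p(cases_a, cases_b, n) < alpha:
            hits += 1
    return hits / studies

rate = false_positive_rate()
print(rate)
```

The printed share should land in the neighborhood of 0.05. Nothing in this simulation involves bias or confounding, which is why the threshold offers no protection against them.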
PROTECTION AGAINST TYPE
2 ERRORS
We protect our study against random type 2
errors by
providing adequate sample size, and
hypothesizing large differences.
The larger the sample size, the easier it will
be to detect a true difference, and the largest
differences will be the easiest to detect.
(Imagine how hard it would be to detect a 1%
increase in the risk of gastroenteritis with
bottle-feeding).
TWO WAYS TO
INCREASE POWER
The probability of detecting a true difference
of a given size is called the power of a study.
1. Choosing the most precise and accurate
measures of exposure and outcome has the
effect of increasing the power of our study,
because the variances of the outcome measures,
which enter into statistical testing, are
decreased.
2. Having an adequately sized sample of study
subjects.
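The effect of sample size and of the size of the hypothesized difference can be sketched numerically, taking power in its usual statistical sense: the probability of detecting a true difference of a given size. The function below is a hypothetical helper using a common normal approximation for comparing two proportions with equal group sizes; the risks and sample sizes are assumed for illustration.

```python
import math
from statistics import NormalDist

def approx_power(risk_a, risk_b, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Normal approximation with equal group sizes; the negligible chance of
    a significant result in the wrong direction is ignored."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(risk_a * (1 - risk_a) / n_per_group
                   + risk_b * (1 - risk_b) / n_per_group)
    return NormalDist().cdf(abs(risk_a - risk_b) / se - z_crit)

# Large difference (20% vs. 10%) against a small one (20% vs. 19%).
for n in (100, 500, 2000):
    print(n, round(approx_power(0.20, 0.10, n), 2), round(approx_power(0.20, 0.19, n), 2))
```

Under these assumptions, power for the large difference rises quickly with sample size, while the one-percentage-point difference remains hard to detect even with 2000 babies per group.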
KEY PRINCIPLE IN BIAS AND
CONFOUNDING
The factor that creates the bias, or the
confounding variable, must be
associated with both the independent
and dependent variables (i.e. with the
exposure and the disease). Association
of the bias or confounder with just one
of the two variables is not enough to
produce a spurious result.
In the example just given:

The BIAS, namely incomplete chart
recording, has to be associated with
feeding type (the independent variable)
and also with recording of
gastroenteritis (the dependent variable)
to produce the false result.

The CONFOUNDING VARIABLE (or
CONFOUNDER), better hygiene, has to be
associated with feeding type and also
with gastroenteritis to produce the
spurious result.
Were the bias or the confounder associated
with just the independent variable or just the
dependent variable, they would not produce
bias or confounding.
This gives a useful rule:
If you can show that a potential confounder is
NOT associated with at least one of the two
variables under study (exposure or outcome),
confounding can be ruled out.
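The rule can be verified with a small expected-value calculation. The sketch below assumes feeding type has no effect of its own and that risk depends only on a third factor (crowding is used as the example); the proportions and risks are made up, and `crude_rr` is a hypothetical helper.

```python
def crude_rr(bottle_crowded_frac, breast_crowded_frac, risk_crowded, risk_uncrowded):
    """Crude RR (bottle vs. breast) when feeding has no effect of its own
    and gastroenteritis risk depends only on crowding."""
    risk_bottle = (bottle_crowded_frac * risk_crowded
                   + (1 - bottle_crowded_frac) * risk_uncrowded)
    risk_breast = (breast_crowded_frac * risk_crowded
                   + (1 - breast_crowded_frac) * risk_uncrowded)
    return risk_bottle / risk_breast

# Crowding linked to feeding type AND to disease: spurious association.
both_links = crude_rr(0.8, 0.2, 0.30, 0.10)
# Crowding linked to disease but equally common in both feeding groups.
no_exposure_link = crude_rr(0.5, 0.5, 0.30, 0.10)
# Crowding linked to feeding type but unrelated to disease risk.
no_outcome_link = crude_rr(0.8, 0.2, 0.20, 0.20)
print(both_links, no_exposure_link, no_outcome_link)
```

Only the first scenario yields a crude RR away from 1. Breaking either link returns the crude RR to 1, which is the basis for ruling out confounding when one association is absent.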
GOOD STUDY DESIGN
PROTECTS AGAINST ALL
FORMS OF ERROR
SOME TYPES OF BIAS

1. SELECTION BIAS

Any aspect of the way subjects are
assembled in the study that creates a
systematic difference between the
compared populations that is not
due to the association under study.
2. INFORMATION BIAS
Any aspect of the way information is collected in
the study that creates a systematic difference
between the compared populations that is not due
to the association under study. (Some call this
measurement bias.) The incomplete chart
recording in the baby feeding example would be a
form of information bias.
Other examples -
Diagnostic suspicion bias
Recall bias
Sometimes biases apply to a population of
studies, rather than to one study, as in publication
bias (tendency to publish papers which show
positive results).
