INSTRUMENT VALIDITY AND RELIABILITY

INSTRUMENT (Construct) – generic term that researchers use for a measurement device: survey, test,
questionnaire, etc.
 Researcher-Completed – administered by researchers
 Subject-Completed – completed by participants

VALIDITY – the degree to which a test measures what it is supposed to measure. (ACCURACY)

 Construct-Related Evidence – involves the extent to which certain explanatory concepts or qualities account for performance
Example: A personality test can be studied to see how well theoretical implications of the typologies account for the actual results obtained.
o Discriminant Validity – instrument does not
correlate significantly with variables from which it should differ
o Convergent Validity – instrument correlates highly
with other variables with which it should theoretically correlate.
 Content Validity (often loosely called Face Validity) – the extent to which the test adequately represents the property being measured. (Does the instrument contain a representative sample of the content being assessed?)
Example: A test that is intended to measure the quality of science instruction in fifth grade should cover material from the fifth-grade science course in a manner appropriate for that grade level.
 Criterion Validity – the extent to which the test relates to some current or future criterion; requires a second measurement
o Predictive Validity – instrument that measures future performance; comparison must
be made between the instrument and later behavior that it predicts
Example: Screening test for five-year-olds to predict success in kindergarten:
prescreening prior to entry into kindergarten and kindergarten performance would be
assessed at the end of the year; a correlation would be calculated between the two
scores
o Concurrent Validity – compares scores on an instrument with current performance on
some other measure; requires second measure at about the same time
Example 1: Validity for a science test could be investigated by correlating scores for the
test with scores from another established science test taken about the same time
Example 2: Administer the instrument to two groups who are known to differ on the trait being measured. The instrument is valid if the scores for the two groups are very different.
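Criterion validity is usually quantified by correlating instrument scores with criterion scores. As a rough sketch of the screening-test example, a Pearson correlation could be computed like this (the scores below are hypothetical, not from the text):

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between paired score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical scores: kindergarten screening test vs. end-of-year performance
screening = [12, 15, 9, 20, 14, 17, 11, 18]
year_end = [55, 62, 48, 80, 60, 70, 50, 75]

# A coefficient near 1 would support the screening test's predictive validity
print(round(pearson_r(screening, year_end), 3))
```

The same calculation applies to concurrent validity (two measures taken at about the same time) and, below, to test-retest reliability (the same measure taken twice).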
RELIABILITY – the degree to which an instrument consistently measures what it intends to measure (STABILITY and CONSISTENCY)
 Test-retest – measures consistency from one time to
the next
Example: Same instrument is given twice to the same group of
people
 Equivalent-Form (Parallel or Alternate) – measures
consistency between two versions of an instrument
Example: Both versions of the instrument measure the same
thing. The same subjects complete both instruments during the
same period.
 Internal-Consistency – measures consistency among the questions within the instrument; easiest form of reliability to investigate
Example: Subjects complete one instrument one time.
o Split-Half – randomly divides all items measuring the same construct into two sets (usually odd-numbered questions vs. even-numbered questions, or the first half vs. the second half).
o Kuder-Richardson Formula 20 and 21 – items on the instrument must be scored dichotomously (0 for incorrect and 1 for correct)
o Cronbach’s Alpha – used to measure internal consistency when items are not scored dichotomously, as is often the case with attitude instruments that use a Likert scale.
 Scoring Agreement – consistency of rating a performance or product among different judges
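The internal-consistency coefficients above can be computed directly from item scores. A minimal sketch of Cronbach's alpha with hypothetical Likert responses (KR-20 is the special case of alpha where every item is scored 0 or 1):

```python
from statistics import pvariance

def cronbach_alpha(items):
    """items: one list of scores per item, all in the same respondent order."""
    k = len(items)
    item_variance_sum = sum(pvariance(item) for item in items)
    totals = [sum(resp) for resp in zip(*items)]  # each respondent's total score
    return k / (k - 1) * (1 - item_variance_sum / pvariance(totals))

# Hypothetical 5-point Likert data: 4 items answered by 6 respondents
items = [
    [4, 5, 3, 4, 2, 5],
    [3, 5, 3, 4, 2, 4],
    [4, 4, 2, 5, 3, 5],
    [3, 5, 3, 4, 1, 4],
]

print(round(cronbach_alpha(items), 2))
```

By convention, values of about 0.7 or higher are taken to indicate acceptable internal consistency.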

Relationship of Test Forms and Testing Sessions Required for Reliability Procedures

                               Test Forms Required
                        One                        Two
Testing     One    Split-Half                Equivalent (Alternative)-Form
Sessions           Kuder-Richardson
Required           Cronbach's Alpha

            Two    Test-Retest

Validity and Reliability Compared

The two do not necessarily go hand-in-hand.


At best, we have a measure that has both high validity and high reliability. It yields consistent results in
repeated application and it accurately reflects what we hope to represent.

It is possible to have a measure that has high reliability but low validity - one that is consistent in getting
bad information or consistent in missing the mark. It is also possible to have one that has low reliability
and low validity - inconsistent and not on target.

Finally, it is not possible to have a measure that has low reliability and high validity - you can't really get
at what you want or what you're interested in if your measure fluctuates wildly.
