
BHAVAN’S INSTITUTE OF MANAGEMENT

MYSORE

SUBJECT: HRM

ASSIGNMENT ON: VALIDITY AND RELIABILITY

SUBMITTED TO: PROF. ROHINI G. SHETTY
HR DEPARTMENT
BHAVAN’S INSTITUTE OF
MANAGEMENT
MYSORE

SUBMITTED BY: SRINIVAS.D
IV {TRI-SEM}, P.G.PM
BHAVAN’S INSTITUTE OF
MANAGEMENT
MYSORE

DATE: 14-10-2008
DAY: TUESDAY

VALIDITY

Introduction

Validity is arguably the most important criterion for the
quality of a test. The term validity refers to whether or not
the test measures what it claims to measure. On a test with
high validity the items will be closely linked to the test’s
intended focus. For many certification and licensure tests
this means that the items will be highly related to a specific
job or occupation. If a test has poor validity then it does not
measure the job-related content and competencies it ought
to. When this is the case, there is no justification for using
the test results for their intended purpose. There are several
ways to estimate the validity of a test, including content
validity, concurrent validity, and predictive validity. The face
validity of a test is sometimes also mentioned.

TYPES OF VALIDITY

Content Validity

While there are several types of validity, the most important
type for most certification and licensure programs is
probably that of content validity. Content validity is a logical
process where connections between the test items and the
job-related tasks are established. If a thorough test
development process was followed, a job analysis was
properly conducted, an appropriate set of test specifications
were developed, and item writing guidelines were carefully
followed, then the content validity of the test is likely to be
very high. Content validity is typically estimated by
gathering a group of subject matter experts (SMEs) together
to review the test items. Specifically, these SMEs are
given the list of content areas specified in the test blueprint,
along with the test items intended to be based on each
content area. The SMEs are then asked to indicate whether
or not they agree that each item is appropriately matched to
the content area indicated. Any items that the
SMEs identify as being inadequately matched to the test
blueprint, or flawed in any other way, are either revised or
dropped from the test.
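
A simple way to summarize such a review is sketched below: for each item, it tallies the proportion of SMEs who judged the item matched to its blueprint content area and flags items that fall below a cutoff. The votes, the 0.75 cutoff, and the helper name are illustrative assumptions rather than a standard procedure; Lawshe's content validity ratio is one well-known formal index built on the same idea.

```python
def sme_agreement(ratings):
    """Proportion of SMEs who judged the item matched to its
    blueprint content area (ratings are True/False votes)."""
    return sum(ratings) / len(ratings)

# Each key is a test item; each value holds one True/False vote per SME.
review = {
    "item_01": [True, True, True, True, True],
    "item_02": [True, False, True, True, False],
    "item_03": [False, False, True, False, False],
}

THRESHOLD = 0.75  # assumed cutoff for an acceptable item-blueprint match

for item, votes in review.items():
    p = sme_agreement(votes)
    status = "keep" if p >= THRESHOLD else "revise or drop"
    print(f"{item}: agreement={p:.2f} -> {status}")
```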

Concurrent Validity

Another important method for investigating the validity of a
test is concurrent validity. Concurrent validity is a statistical
method using correlation, rather than a logical method.
Examinees who are known to be either masters or non-
masters on the content measured by the test are identified,
and the test is administered to them under realistic exam
conditions. Once the tests have been scored, the
relationship is estimated between the examinees’ known
status as either masters or non-masters and their
classification as masters or non-masters (i.e., pass or fail)
based on the test. This type of validity provides evidence
that the test is classifying examinees correctly. The stronger
the correlation is, the greater the concurrent validity of the
test is.
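
As a rough illustration of the correlation involved, the sketch below codes both the examinees' known status and their test classification as 1 (master/pass) or 0 (non-master/fail) and correlates the two; with two binary variables, Pearson's r is the phi coefficient. All of the data are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation; on two 0/1 variables this is the phi coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

known_status  = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]  # established masters (1) / non-masters (0)
test_decision = [1, 1, 0, 0, 0, 1, 0, 1, 1, 0]  # pass (1) / fail (0) on the test

print(f"concurrent validity (phi) = {pearson_r(known_status, test_decision):.2f}")  # 0.60
```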

Predictive Validity

Another statistical approach to validity is predictive validity.
This approach is similar to concurrent validity, in that it
measures the relationship between examinees'
performances on the test and their actual status as masters
or non-masters. However, with predictive validity, it is the
relationship of test scores to an examinee's future
performance as a master or non-master that is estimated. In
other words, predictive validity considers the question,
"How well does the test predict examinees' future status as
masters or non-masters?" For this type of validity, the
correlation that is computed is between the examinees'
classifications as master or non-master based on the test
and their later performance, perhaps on the job. This type of
validity is especially useful for test purposes such as
selection or admissions.
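
The computation is the same in spirit; only the timing of the second variable changes. The sketch below correlates pass/fail decisions made at test time with later on-the-job performance ratings. The rating scale and all the numbers are invented for illustration.

```python
from statistics import correlation  # Pearson r, available in Python 3.10+

test_decision = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]  # pass (1) / fail (0) at test time
job_rating    = [4.2, 3.8, 2.1, 2.9, 4.5, 2.4, 3.6, 4.0, 3.1, 2.0]  # later supervisor ratings

print(f"predictive validity = {correlation(test_decision, job_rating):.2f}")
```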

Face Validity

One additional type of validity that you may hear mentioned
is face validity. Like content validity, face validity is
determined by a review of the items and not through the use
of statistical analyses. Unlike content validity, face validity
is not investigated through formal procedures and is not
determined by subject matter experts. Instead, anyone who
looks over the test, including examinees and other
stakeholders, may develop an informal opinion as to
whether or not the test is measuring what it is supposed to
measure. While it is clearly of some value to have the test
appear to be valid, face validity alone is insufficient for
establishing that the test is measuring what it claims to
measure. A well-developed exam program will include formal
studies into other, more substantive types of validity.

Summary

The validity of a test is critical because, without sufficient
validity, test scores have no meaning. The evidence you
collect and document about the validity of your test is also
your best legal defense should the exam program ever be
challenged in a court of law. While there are several ways to
estimate validity, for many certification and licensure exam
programs the most important type of validity to establish is
content validity.

RELIABILITY

Introduction

Reliability is one of the most important elements of test
quality. It has to do with the consistency, or reproducibility,
of an examinee's performance on the test. For example, if
you were to administer a test with high reliability to an
examinee on two occasions, you would be very likely to
reach the same conclusions about the examinee's
performance both times. A test with poor reliability, on the
other hand, might result in very different scores for the
examinee across the two test administrations. If a test
yields inconsistent scores, it may be unethical to take any
substantive actions on the basis of the test. There are
several methods for computing test reliability, including
test-retest reliability, parallel forms reliability, decision
consistency, internal consistency, and interrater reliability.
For many criterion-referenced tests decision consistency is
often an appropriate choice.

Types of Reliability

Test-Retest Reliability

To estimate test-retest reliability, you must administer a test
form to a single group of examinees on two separate
occasions. Typically, the two separate administrations are
only a few days or a few weeks apart; the time should be
short enough so that the examinees' skills in the area being
assessed have not changed through additional learning. The
relationship between the examinees' scores from the two
different administrations is estimated, through statistical
correlation, to determine how similar the scores are. This
type of reliability demonstrates the extent to which a test is
able to produce stable, consistent scores across time.
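
A minimal sketch of that computation, assuming each examinee's score from both sittings is available (scores invented for illustration):

```python
from statistics import correlation  # Pearson r, available in Python 3.10+

first_sitting  = [78, 85, 62, 90, 71, 66, 88, 74]  # scores, administration 1
second_sitting = [80, 83, 65, 92, 69, 70, 85, 76]  # same examinees, a few weeks later

print(f"test-retest reliability = {correlation(first_sitting, second_sitting):.2f}")
```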

Parallel Forms Reliability

Many exam programs develop multiple, parallel forms of an
exam to help provide test security. These parallel forms are
all constructed to match the test blueprint and to be similar
in average item difficulty. Parallel forms reliability is estimated by
administering both forms of the exam to the same group of
examinees. While the time between the two test
administrations should be short, it does need to be long
enough so that examinees' scores are not affected by
fatigue. The examinees' scores on the two test forms are
correlated in order to determine how similarly the two test
forms function. This reliability estimate is a measure of how
consistent examinees’ scores can be expected to be across
test forms.
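
The arithmetic mirrors the test-retest sketch above; only the meaning of the two score lists changes. Scores are again invented for illustration.

```python
from statistics import correlation  # Pearson r, available in Python 3.10+

form_a = [71, 88, 64, 92, 75, 69]  # each examinee's score on Form A
form_b = [74, 85, 61, 90, 78, 72]  # the same examinees' scores on Form B

print(f"parallel forms reliability = {correlation(form_a, form_b):.2f}")
```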

Decision Consistency

In the descriptions of test-retest and parallel forms
reliability given above, the consistency or dependability of
the test scores was emphasized. For many criterion-
referenced tests (CRTs), a more useful way to think about
reliability may be in terms of examinees’ classifications. For
example, a typical CRT will result in an examinee being
classified as either a master or non-master; the examinee
will either pass or fail the test. It is the reliability of this
classification decision that is estimated in decision
consistency reliability. If an examinee is classified as a
master on both test administrations, or as a non-master on
both occasions, the test is producing consistent decisions.
This approach can be used either with parallel forms or with
a single form administered twice in test-retest fashion.
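
One simple index is the proportion of examinees who receive the same classification on both occasions, sketched below with invented pass/fail data (1 = master, 0 = non-master).

```python
first_decision  = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]  # classification, occasion 1
second_decision = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]  # classification, occasion 2

consistent = sum(a == b for a, b in zip(first_decision, second_decision))
p0 = consistent / len(first_decision)  # proportion of consistent decisions
print(f"decision consistency = {p0:.2f}")  # 8 of 10 agree -> 0.80
```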

Internal Consistency

The internal consistency measure of reliability is frequently
used for norm-referenced tests (NRTs). This method has the
advantage that it can be computed from a single test form
given at a single administration. The internal consistency
method estimates how well the set of items on a test
correlate with one another; that is, how similar the items on
a test form are to one another. Many test analysis software
programs produce this reliability estimate automatically.
However, two common differences between NRTs and CRTs
make this method of reliability estimation less useful for
CRTs. First, because CRTs are typically designed to have a
much narrower range of item difficulty and examinee
scores, the value of the reliability estimate will tend to be
lower. Additionally, CRTs are often designed to measure a
broader range of content; this results in a set of items that
are not necessarily closely related to each other. This aspect
of CRT test design will also produce a lower reliability
estimate than would be seen on a typical NRT.
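
Cronbach's alpha is the most common such estimate. The sketch below computes it from a small invented matrix of 0/1 item scores (rows are examinees, columns are items); a real program would typically rely on its analysis software's built-in estimate.

```python
from statistics import pvariance

scores = [            # rows: examinees, columns: items scored 1 (right) / 0 (wrong)
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0],
]

k = len(scores[0])                                     # number of items
item_vars = [pvariance(col) for col in zip(*scores)]   # variance of each item
total_var = pvariance([sum(row) for row in scores])    # variance of total scores

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```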

Interrater Reliability

All of the methods for estimating reliability discussed thus
far are intended to be used for objective tests. When a test
includes performance tasks, or other items that need to be
scored by human raters, then the reliability of those raters
must be estimated. This reliability method asks the
question, "If multiple raters scored a single examinee's
performance, would the examinee receive the same score?"
Interrater reliability provides a measure of the dependability
or consistency of scores that might be expected across
raters.
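
Cohen's kappa is one widely used index for two raters because it corrects raw agreement for the agreement expected by chance. The sketch below applies it to invented pass/fail ratings (1 = pass, 0 = fail).

```python
rater_a = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]  # rater A's pass/fail scores
rater_b = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]  # rater B's scores for the same performances

n = len(rater_a)
p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n  # observed agreement

# chance agreement, from each rater's marginal pass rate
pa, pb = sum(rater_a) / n, sum(rater_b) / n
p_e = pa * pb + (1 - pa) * (1 - pb)

kappa = (p_o - p_e) / (1 - p_e)
print(f"interrater reliability (Cohen's kappa) = {kappa:.2f}")  # 0.58 here
```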

Summary

Test reliability is the aspect of test quality concerned with
whether or not a test produces consistent results. While
there are several methods for estimating test reliability, for
objective CRTs the most useful types are probably test-
retest reliability, parallel forms reliability, and decision
consistency. A type of reliability that is more useful for NRTs
is internal consistency.
For performance-based tests, and other tests that use
human raters, interrater reliability is likely to be the most
appropriate method.

