
What is Measurement?

How do we define it?

We usually define the term measurement as the assignment of values to objects according to some
system of rules. This definition originates with Stevens (1946), who presented what have become the
four traditional scales or types of measurement. We'll talk about these shortly. For now, let's focus
on the general measurement process, which involves giving an object, the person or thing for whom
we're measuring, a value that represents something about it.

Measurement is happening all the time, all around us. Daily, we measure what we eat, where we go,
and what we do. For example, drink sizes are measured using categories like tall, grande, and venti.
A jog or a commute is measured in miles or kilometers. We measure the temperature of our homes,
the air pressure in our tires, and the carbon dioxide in our atmosphere. The wearable technology you
might have strapped to your wrist could be monitoring your lack of movement and decreasing heart
rate as you doze off reading this sentence. After you wake up, you might check your watch and
measure the length of your nap in minutes or hours.

These are all examples of physical measurement. In each example, you should be able to identify 1)
the object of measurement, 2) the property or quality that's being measured for it, and 3) the kinds of
values that could be used to represent amounts of this quality or property. The property or quality
that's being measured for an object is called the variable. The kinds of values we assign to an object,
for example, grams or degrees Celsius or beats per minute, are referred to as the units of
measurement that are captured within that variable.

So, three things are required for measurement to happen: an object, a variable, and values or units.
Again, the variable is the quality or property we measure, the object is for whom we measure it, and
the values are the numbers or labels we assign. Once you can identify these three components for
each physical measurement example above, make sure you can come up with your own examples
that contain all three parts.
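
One way to check your own examples is to write each one down as a small record with all three parts filled in. The sketch below is only an illustration: the `Measurement` class, its field names, and the 25-minute nap are hypothetical choices, not notation used elsewhere in this course.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Measurement:
    """One measurement: an object, a variable, and a value (with optional units)."""
    obj: str                  # the person or thing being measured
    variable: str             # the property or quality being measured
    value: Union[float, str]  # the number or label assigned
    unit: str = ""            # units of measurement, if any


def describe(m: Measurement) -> str:
    """Return a one-line summary that names all three components."""
    return f"{m.obj}: {m.variable} = {m.value} {m.unit}".strip()


# Examples adapted from the text; the nap length is a made-up value.
examples = [
    Measurement("cup of cola", "sugar content", 44, "grams"),
    Measurement("coffee order", "drink size", "grande"),
    Measurement("nap", "duration", 25, "minutes"),
]

for m in examples:
    print(describe(m))
```

Note that the drink size example has a label rather than a number as its value, and no units. It still counts as measurement because a value is assigned to an object according to a rule.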

From Physical to Intangible

With most physical measurements, the property that we're trying to represent or capture with our
values can be clearly defined and consistently measured. For example, amounts of food are
commonly measured in grams. A cup of cola has about 44 grams of sugar in it. When you see that
number printed on your can of soda pop or fizzy water, the meaning is pretty clear, and there's really
no need to question if it's accurate. Cola has a lot of sugar in it.

But, just as often, we take a number like the amount of sugar in our food and use it to represent
something abstract or intangible like how healthy or nutritious the food is. A food's healthiness isn't
as easy to define as its mass or volume. A measurement of healthiness or nutritional value might
account for the other ingredients in the food and how many calories they boil down to. Furthermore,
different foods can be more or less nutritional for different people, depending on a variety of factors.
Healthiness, unlike physical properties, is intangible and difficult to measure.

The social sciences of education and psychology typically focus on the measurement of constructs,
intangible and unobservable qualities, attributes, or traits that we assume are causing certain
observable behavior or responses. In this course, our objects of measurement are typically people,
and our goal is to give these people numbers or labels that tell us something meaningful about
qualities such as their intelligence, their math ability, or their social anxiety. Constructs such as these
are difficult to measure. That's why we need an entire course to discuss how to best measure them.

A good question to ask at this point is, how can we measure and provide values for something that's
unobservable? How do we score a person's math ability if we can't observe it directly? What we
need is an operationalization of our construct, an observable behavior or response that increases or
decreases as a person moves up or down on the construct. With math ability, that operationalization
might be the number of math questions a person answers correctly out of 20. With social anxiety, it
might be the frequency of feeling anxious over a given period of time. When using a proxy for our
construct, we have to assume or infer that the operationalization we're actually observing and
measuring accurately represents the underlying quality or property that we're interested in. This
brings us to the overarching question for this course.
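
The math ability example above can be sketched in a few lines of code: the observable behavior is a set of 20 responses, and the operationalization is the count of responses that match an answer key. The key, the responses, and the function name `number_correct` are all hypothetical and serve only to illustrate the idea.

```python
from typing import Sequence


def number_correct(responses: Sequence[str], key: Sequence[str]) -> int:
    """Operationalize math ability as the number of items answered correctly."""
    if len(responses) != len(key):
        raise ValueError("responses and key must have the same number of items")
    return sum(r == k for r, k in zip(responses, key))


# Hypothetical 20-item multiple-choice test and one person's responses.
answer_key = list("ABCDABCDABCDABCDABCD")
responses = list("ABCDABCDABCDABCAAAAA")

score = number_correct(responses, answer_key)
print(f"{score} out of {len(answer_key)} correct")
```

The score is something we can observe and count. Whether it accurately represents the underlying math ability is an inference, not something the code can confirm.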

What makes measurement good?

In the last year of my undergraduate in psychology I conducted a research study on the constructs of
aggression, sociability, and victimization with Italian preschoolers (D. A. Nelson, Robinson, Hart,
Albano, & Marshall, 2010). I spent about four weeks collecting data in preschools. Data collection
involved covering a large piece of cardboard with pictures of all the children in a classroom, and then
asking each child, individually, questions about their peers.

To measure sociability, we asked three simple questions: "Who is fun to talk to?" "Who is fun to do
pretend things with?" and "Who has many friends?" Kids with lots of peer nominations on these
questions received a higher score, indicating that they were more sociable. After asking these and
other questions to about 300 preschoolers, and then tallying up the scores, I wondered how well we
were actually measuring the constructs we were targeting. Were these scores any good? Were three or
five questions enough? Maybe we were missing something important? Maybe some of these
questions, which had to be translated from English into Italian, meant different things on the coast of
the Mediterranean than they did in the Midwest US?
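
To make the tallying step concrete, here is a minimal sketch of peer-nomination scoring. It assumes that a child's sociability score is simply the total number of nominations received across the three questions. The text does not describe the study's exact scoring rule, so this rule, the children's names, and the data are all hypothetical.

```python
from collections import Counter

# Each record is (child asked, question, peer nominated); all values are made up.
nominations = [
    ("Anna", "fun to talk to", "Luca"),
    ("Anna", "fun to pretend with", "Marco"),
    ("Anna", "has many friends", "Luca"),
    ("Luca", "fun to talk to", "Marco"),
    ("Luca", "fun to pretend with", "Anna"),
    ("Luca", "has many friends", "Marco"),
    ("Marco", "fun to talk to", "Luca"),
    ("Marco", "fun to pretend with", "Luca"),
    ("Marco", "has many friends", "Anna"),
]

# Assumed scoring rule: one point for every nomination a child receives.
sociability = Counter(nominee for _, _, nominee in nominations)

for child, score in sociability.most_common():
    print(child, score)
```

Even in this toy version, the questions raised above apply: a different set of questions, or a different scoring rule, could change which children appear most sociable.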

This project was my first experience on the measuring side of measurement, and it fascinated me.
The questions that I asked then are the same questions that we'll ask and answer in this course. How
consistently and accurately are we measuring what we intend to measure? What can we do to
improve our measurement? And how can we identify instruments that are better or worse than
others? These questions all have to do with what makes measurement good.

Many different things make measurement good, from writing high-quality questions and items to
adherence to established test development guidelines. For the most part, the resulting scores are
considered good, or effective, when they consistently and accurately describe a target construct.
Consistency and accuracy refer to the reliability and validity of test scores, that is, the extent to
which the same scores would be obtained across repeated administrations of a test, and the extent to
which scores fully represent the construct they are intended to measure.
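
As a rough illustration of consistency across repeated administrations, the sketch below correlates scores from two administrations of the same test. Using a Pearson correlation this way (often called test-retest reliability) is one common approach and is an assumption here, not a method stated in this section. The scores and the function name `pearson_r` are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Made-up scores for five people who took the same test twice.
time1 = [12, 15, 9, 18, 14]
time2 = [13, 14, 10, 17, 15]

r = pearson_r(time1, time2)
print(f"correlation between administrations: {r:.2f}")
```

A correlation near 1 suggests that people keep roughly the same relative standing across administrations. It says nothing on its own about validity, that is, whether the scores represent the intended construct.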

These two terms, reliability and validity, will come up many times throughout the course. The
second one, validity, will help us clarify our definition of measurement in terms of its purpose. Of all
the considerations that make for effective measurement, the first to address is purpose.

What is the purpose?

Measurement is useless unless it is based on a clearly articulated purpose. This purpose describes the
goals of administering a test or survey, including what will be measured, for whom, and why? We've
already established the "what?" as the variable or construct, the property, quality, attribute, or trait
that our numbers or values represent. We've also established the "for whom?" as the object, in our
case, people, but more specifically perhaps students, patients, or employees. Now we need to
establish the "why?"

The purpose of a test specifies its intended application and use. It addresses how scores from the test
are designed to be interpreted. A test without a clear purpose can't be effective.

Suppose someone asks you to create a measure of students' financial savvy, that is, their
understanding of money and how it's used in finance. You've got here a simple construct,
understanding of finance, and the object of measurement, students. But before you can develop this
test you'd need to know how it is going to be used. Its purpose will determine key features like what
specific content the test contains, the level of difficulty of the questions, the types of questions used,
and how it's administered. If the test is used as a final exam in a finance course, it should capture the
content of that course, and it might be pretty rigorous. On the other hand, if it's used with the general
student body to see what students know about balancing budgets and managing student loans, the
content and difficulty might change. Clearly, you can't develop a test without knowing its purpose.
Furthermore, a test designed for one purpose may not function well for another.

Take a minute to think about some of the tests you've used or taken in the past. How would you
express the purposes of these tests? When answering this question, be careful to avoid simply saying
that the purpose of the test is to measure something. A statement of test purpose should clarify what
can be done with the resulting scores. For example, scores from placement tests are used to
determine what courses a student should take or identify students in need of certain instructional
resources. Scores on admissions tests inform the selection of applicants for entrance to a college or
university. Scores on certification and licensure exams are used to verify that examinees have the
knowledge, skills, and abilities required for practice in a given profession. Table 1.1 includes these
and a few more examples. In each case, scores are intended to be used in a specific way.

Table 1.1: Intended Uses for Some Common Types of Standardized Tests

Test Type       Intended Use
Accountability  Hold various people responsible for student learning
Admissions      Selection for entrance to an educational institution
Employment      Help in hiring and promotion of employees
Exit Testing    Check for mastery of content required for graduation
Licensing       Verify that candidates are fit for practice
Placement       Selecting coursework or instructional needs

Here's another example that I'll use throughout this course. Some of my work and research is based
on a type of standardized placement testing that is used to measure student growth over a short
period of time. In addition to measuring growth, scores are also used to evaluate the effectiveness of
intervention programs, where effective interventions lead to positive results for students. My latest
project involved measures of early literacy called myIGDIs (Bradfield et al., 2014). A brochure for
the measures from www.myigdis.com states,

myIGDIs are a comprehensive set of assessments for monitoring the growth and development of
young children. myIGDIs are easy to collect, sensitive to small changes in children's achievement,
and mark progress toward a long-term desired outcome. For these reasons, myIGDIs are an excellent
choice for monitoring English Language Learners and making more informed Special Education
evaluations.

Note that these are some specific and ambitious claims. Validity evidence is needed to demonstrate
that scores can effectively be used in this way.

The point of these examples is simply to clarify what goes into a statement of purpose, and why a
well-articulated purpose is an essential first step to measurement. We'll come back to validation of
test purpose in Chapters 2 and 9. For now, you just need to be familiar with how a test purpose is
phrased and why it's important.

Summary

To summarize this section, the measurement process allows us to capture information about
individuals that can be used to describe their standing on a variety of constructs, from educational
ones, like math ability and vocabulary knowledge, to psychological ones, like sociability and
aggression. We measure these properties by operationalizing our construct, for example, in terms of
the number of items answered correctly or the number of times individuals exhibit a certain behavior.
These operational variables are then assumed to represent our construct of interest. Finally, our
measures of these constructs can then be used for specific purposes, such as to inform research
questions about the relationship between sociability and aggression, or to measure growth in early
literacy.

So, measurement involves a construct that we don't directly observe and an operationalization of it
that we do observe. Our measurement is said to be effective when there is a strong connection
between the two, which is best obtained when our measurement has a clear purpose. In the next two
sections, on measurement scales and scoring, we'll focus on how to handle the operational side of
measurement. Then, with measurement models, we'll consider the construct side. Finally, in the
section on score referencing, we'll talk about additional labels that we use to give meaning to our
scores.
