DEVELOPMENT OF MEASUREMENT:
Etymological: The word measurement derives from "measure", which comes from the Greek "metria", meaning "a measuring". To know an object or a person really means to be able to describe it accurately and comprehensively, but any description of an object, phenomenon or person is selective. Being multi-dimensional, a person or phenomenon cannot be measured as a whole; rather, we measure the characteristics of a person or phenomenon. Evaluation is a process, both formal and informal, applied continuously to know objects and persons.
Psychological measurement began as a part of philosophy, with early attempts to measure the mind or soul. After 1850, psychology moved toward quantitative measurement, for instance of the amount of forgetting or the level of intelligence. By 1900, psychology had begun to apply measurement techniques in all directions. Early attempts to measure human behavior through experiments were ridiculed, but the growth of experimentation, the Darwinian demonstration, and the clinical adjustment of individuals laid the basic foundations of psychological measurement.
b) Development in the nineteenth century
Galton was the first to undertake systematic and statistical investigation of individual differences. He was followed by Weber, and by Alexander Bain, whose "The Senses and the Intellect" and "The Emotions and the Will" led to the development of a number of scientific tools for measurement. The first psychological laboratory was set up in 1879 at the University of Leipzig. Later, Karl Pearson developed the product-moment method for computing the correlation coefficient, and Binet developed his intelligence scales.
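Pearson's product-moment method mentioned above can be made concrete with a short computation. The following is a minimal Python sketch of the product-moment formula; the two test-score lists are hypothetical illustrative data, not drawn from the text.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical scores of five students on two tests
test_a = [10, 20, 30, 40, 50]
test_b = [12, 24, 33, 46, 55]
print(round(pearson_r(test_a, test_b), 3))  # → 0.998
```

A coefficient near +1 indicates that students who score high on one test tend to score high on the other.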
c) Development in the twentieth century
This period started with the exploration and standardization of early tests and methods: spelling tests, arithmetic tests, language tests, and the Thorndike handwriting scale, along with group tests of intelligence. Standardized tests then multiplied, and data sheets, questionnaires and inventories came into being. Large-scale batteries for educational and personal use were produced as tools; the Binet-Simon intelligence scale was one such instrument. The Army Alpha and Beta tests, among others, gave rise to a new methodology known as psychometric theory.
Topical Development:
a) Contribution of Galton
b) Contribution of Alfred Binet
Individual tests, performance tests, aptitude tests, group tests, multifactor tests, personality tests, test batteries, rating scales, inventories, etc.
Thus, scientific interest and method spread from the biological sciences into the behavioral sciences: the rate of learning and the time complexity of mental tasks came to be measured and interpreted in terms of statistical designs and techniques. Analyzing and describing individual differences was made simpler; oral and written examinations and classroom tests were supplemented by the measurement of intelligence. Measurement established its own problems of study, its factors, and its scope. It extended learning theory to all the sense organs. Measurement can establish new relationships and clarify the functioning of memory, mental imagery, attention, strength of feeling, skill and value judgment.
CONCEPT OF MEASUREMENT:
N. R. Campbell (1920) defined the concept of measurement in the following ways:
(i) The assignment of numerals to represent properties.
(ii) The process of assigning numbers to represent qualities.
(iii) The assignment of numerals to represent properties according to scientific laws.
(iv) The assignment of numerals to things so as to represent facts or conventions about them.
According to S. S. Stevens, measurement is the assignment of numerals to objects, events, or aspects of objects according to rules, according to any rule.
MEANING OF MEASUREMENT:
Physical measurement exists in the physical and material world and is concerned primarily with dimensions such as age, weight, length and capacity. These measures are quantitative and therefore require units such as years, months, kilograms, metres and litres. Mental measurement, by contrast, is qualitative rather than quantitative; it is not precise, and it is subjective and indefinite.
FUNCTIONS OF MEASUREMENT:
1. Prognosis function.
This function tells about the differences among students' performances at the moment. Prognosis has administrative functions such as the classification, selection, promotion and gradation of students. All such decisions involve prediction: when psychological testing is mentioned, the so-called I.Q. tests administered to students in school to predict their academic performance and future behavior come to mind.
2. Diagnosis function.
The diagnosis function identifies weaknesses in student learning. Remedial instruction can be prepared on the basis of a diagnosis. Diagnosis also implies prediction, but there is considerable justification for listing it as a separate function of measurement. It establishes cause-effect relationships and thereby improves instructional procedures.
3. Research function.
Measurement provides a more objective and dependable basis for comparison than do rough impressions. Test scores quantify real and useful variables, and scientific hypotheses are verified with the help of measurement.
There is a need for measurement in education and psychology for a large number of reasons and purposes. Educational or psychological measurement is simply the means by which qualitative aspects of human behavior are observed with greater accuracy. Its purposes are to make possible more accurate prediction and control in the educational process.
Effective utilization of manpower is essential in any modern society. We need to know the levels and combinations of aptitudes involved in the development of different types of behavior in each vocational area. Measurement provides feedback on all aspects of educational planning so that our educational programmes can be reoriented and updated from time to time.
Educational placement
There are two overall functions of education: the integrative and the differentiative. Integrative education is designed to make people alike in their ideals, values, virtues, language, and general intellectual and social adjustment; it is also known as general education, and it adapts the curriculum to the measured aptitudes and abilities of the students. In its differentiative function, education prepares individuals for particular professions and specialities, and students are selected in terms of their ability to succeed in various professional and specialized courses.
Measurement is done for selecting students who will succeed in a given curriculum. Counseling uses measurement as an aid to help the individual student find the vocation, college curriculum and social environment which will ensure his successful adjustment. Aptitudes, interests, traits, skills and achievement profiles are assessed with a view to enabling him to make an optimum vocational, educational or social adjustment. The uses of psychological measurement in counseling are: the objective appraisal of personality for better self-understanding and self-direction; an improved basis for predicting achievement and growth and for measuring capacity; the diagnosis of mental disabilities, deficiencies and aberrations; and the evaluation of the outcomes of counseling.
Improvement of instruction
Conventional instruction applies standardized educational methods to the whole group, using uniform content, assignments, methods of teaching and learning, and examinations pitched to the general level of achievement. Measurement helps to identify the differences within such groups, not only in skill areas but also in terms of interest in a topic or in terms of personality and social needs. Merely reporting marks or letter grades to parents is not consistent with a policy of meeting the needs of individual pupils in each skill area.
Effective learning should result in complex behavior patterns which may be differentiated into higher or lower degrees of habits, skills, understanding, feeling, and so on. Measurement must take into account the rate at which a given trait develops and the level of development attained at maturity, because the various traits of an individual develop at different rates and reach different levels of maturity.
Increase of accountability
Testing and measurement have received immense impetus from such recent educational movements as excellence, effective schooling, minimum competency and, above all, public accountability. The pressure of these movements has brought more effective and more varied kinds of tests to schools, and it has heightened the demand of policy-makers and the public for detailed information about tests and test results. Measurement supports accountability in teaching, administration, counseling, curriculum construction and instructional design for better performance, and makes individual accountability in testing possible.
Value of testing in Education
Students and teachers depend on the immediate and ultimate rewards or satisfactions obtained from their efforts. Measurement helps to establish creative values and good work products, and to identify good administrators, excellent students, and their teachers.
The levels of measurement are:
Nominal level
Ordinal or Rank level
Interval level
Ratio level
Nominal (categorical) scores - when a score places people or things into a category, these are called nominal scores. Nominal scores cannot be ranked or ordered along any dimension. The categories must be exhaustive and mutually exclusive.
Ordinal scores - people or things are rank-ordered along some dimension. No common unit of measurement exists between rankings in a system of ordinal scores, and comparisons cannot be made across different group rankings.
Interval scores - these scores have a common unit of measurement between adjacent points, but no true zero point exists on the interval scale.
Ratio scores - these scores have a common unit of measurement between adjacent scores, and a true zero point exists.
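The four levels differ in which comparisons are meaningful. The following Python sketch, using hypothetical data, makes the distinctions concrete:

```python
# Hypothetical data illustrating the four levels of measurement
nominal = ["science", "arts", "commerce"]   # categories only
ordinal = [1, 2, 3]                         # ranks: 1st, 2nd, 3rd
interval_celsius = [10.0, 20.0, 30.0]       # temperature: no true zero
ratio_kg = [10.0, 20.0, 30.0]               # weight: true zero exists

# Nominal: only equality or inequality is meaningful
assert nominal[0] != nominal[1]

# Ordinal: order is meaningful, but the size of the gap between ranks is not
assert ordinal[0] < ordinal[1]

# Interval: differences are meaningful because the units are equal...
assert interval_celsius[1] - interval_celsius[0] == \
       interval_celsius[2] - interval_celsius[1]
# ...but ratios are not: 20 degrees C is not "twice as hot" as 10 degrees C

# Ratio: ratios are meaningful, since zero means none of the quantity
assert ratio_kg[1] / ratio_kg[0] == 2.0
```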
EVALUATION
DEVELOPMENT OF EVALUATION:
Etymological: The word evaluation comes from the French word "evaluer", meaning "to find the value of."
Evaluation means gathering information in order to draw conclusions and make new predictions. It is useful to start with the differences between evaluation and research. First, in research, problem selection and definition are the responsibility of the researcher, whereas in evaluation the context of the study almost completely defines the problem for investigation. Second, research hypotheses are usually derived by deduction from theories or by induction from an organized body of knowledge; in evaluation, precise hypotheses can rarely be generated, and the task more usually becomes that of testing generalizations derived from previous knowledge and experience. Third, every evaluation study is unique. Fourth, evaluation has to be conducted in the presence of a multitude of variables which could be relevant to the interpretation of results, with randomization generally impossible or impractical to accomplish. Fifth, the data to be collected are heavily influenced by the evaluation setting. Sixth, evaluation involves value judgment. According to B. Bloom, evaluation is the collecting and analyzing of evidence of the extent to which various groups "see value - see worth in - stated objectives".
CONCEPT OF EVALUATION:
1. Evaluation is the process of delineating, obtaining and providing useful information for judging
decision alternatives.
2. According to Stufflebeam (1971), the adequacy of an evaluation may therefore be assessed on five criteria:
--- Reliability (is the information what the decision maker needs?)
--- Pervasiveness (does the information reach all decision makers who need it?)
--- Credibility (is the information trusted by the decision maker and those he must serve?)
Planning decisions --------- Context evaluation: identifies what improvements are needed in a
part of the educational system by specifying the major goals and specific
objectives to be served. It defines the context to be served, questions the
value of any stated goals, identifies and assesses needs and unused
opportunities, and identifies and delineates the problems underlying those needs.
Implementation decisions --------- Process evaluation: involves monitoring the educational
activities in order to aid the decision-maker responsible for controlling
their execution. The aim is to identify, and if possible anticipate, any
defects in the design of the project or its implementation. The process
evaluator therefore accepts the programme as it is and as it evolves, and
focuses whatever evaluative techniques may be appropriate on the most
crucial aspects of the project, in order to build up an account of what
actually happens. Process evaluation may thus have both a formative
function and a summative function.
MEANING OF EVALUATION:
Evaluation involves assessing the strengths and weaknesses of programs, policies, personnel,
products, and organizations to improve their effectiveness.
Evaluation is the systematic collection and analysis of data needed to make decisions, a process in
which most well-run programs engage from the outset. Here are just some of the evaluation activities
that are already likely to be incorporated into many programs or that can be added easily:
Pinpointing the services needed, for example, finding out what knowledge, skills, attitudes, or behaviors a program should address.
Establishing program objectives and deciding the particular evidence (such as the specific knowledge, attitudes, or behavior) that will demonstrate that the objectives have been met. A key to successful evaluation is a set of clear, measurable, and realistic program objectives. If objectives are unrealistically optimistic or are not measurable, the program may not be able to demonstrate that it has been successful even if it has done a good job.
Developing or selecting from among alternative program approaches, for example, trying different curricula or policies and determining which ones best achieve the goals.
Tracking program objectives, for example, setting up a system that shows who gets services, how much service is delivered, how participants rate the services they receive, and which approaches are most readily adopted by staff.
Trying out and assessing new program designs, determining the extent to which a particular approach is being implemented faithfully by school or agency personnel or the extent to which it attracts or retains participants.
Through these types of activities, those who provide or administer services determine what to offer
and how well they are offering those services. In addition, evaluation in education can identify
program effects, helping staff and others to find out whether their programs have an impact on
participants' knowledge or attitudes.
The different dimensions of evaluation have formal names: process, outcome, and impact evaluation.
Rossi and Freeman (1993) define evaluation as "the systematic application of social research
procedures for assessing the conceptualization, design, implementation, and utility of ... programs."
There are many other similar definitions and explanations of "what evaluation is" in the literature.
Our view is that, although each definition, and in fact, each evaluation is slightly different, there are
several different steps that are usually followed in any evaluation. It is these steps which guide the
questions organizing this handbook. An overview of the steps of a "typical" evaluation follows.
Process Evaluations
Process Evaluations describe and assess program materials and activities. Examination of materials
is likely to occur while programs are being developed, as a check on the appropriateness of the
approach and procedures that will be used in the program. For example, program staff might
systematically review the units in a curriculum to determine whether they adequately address all of
the behaviors the program seeks to influence. A program administrator might observe teachers using
the program and write a descriptive account of how students respond, then provide feedback to
instructors. Examining the implementation of program activities is an important form of process
evaluation. Implementation analysis documents what actually transpires in a program and how
closely it resembles the program's goals. Establishing the extent and nature of program
implementation is also an important first step in studying program outcomes; that is, it describes the
interventions to which any findings about outcomes may be attributed. Outcome evaluation assesses
program achievements and effects.
THE PROCESS OF EVALUATION INCLUDES THE FOLLOWING:
The teaching-learning process
The teacher
The student's progress
The Parents
The administrators and supervisors
Guidance cell
Agencies
Outcome Evaluations
Outcome Evaluations study the immediate or direct effects of the program on participants. For
example, when a 10-session program aimed at teaching refusal skills is completed, can the
participants demonstrate the skills successfully? This type of evaluation is not unlike what happens
when a teacher administers a test before and after a unit to make sure the students have learned the
material. The scope of an outcome evaluation can extend beyond knowledge or attitudes, however,
to examine the immediate behavioral effects of programs.
Impact Evaluations
Impact Evaluations look beyond the immediate results of policies, instruction, or services to identify
longer-term as well as unintended program effects. They may also examine what happens when several programs operate in unison. For example, an impact evaluation might examine whether a program's
immediate positive effects on behavior were sustained over time. Some school districts and
community agencies may limit their inquiry to process evaluation. Others may have the interest and
the resources to pursue an examination of whether their activities are affecting participants and
others in a positive manner (outcome or impact evaluation). The choices should be made based upon
local needs, resources, and requirements.
Regardless of the kind of evaluation, all evaluations use data collected in a systematic manner. These
data may be quantitative, such as counts of program participants, amounts of counseling or other services received, or incidence of a specific behavior. They also may be qualitative, such as descriptions of what transpired at a series of counseling sessions or an expert's best judgment of the
age-appropriateness of a skills training curriculum. Successful evaluations often blend quantitative
and qualitative data collection. The choice of which to use should be made with an understanding
that there is usually more than one way to answer any given question.
Need of Evaluation:
Evaluations serve many purposes. Before assessing a program, it is critical to consider who is most
likely to need and use the information that will be obtained and for what purposes. Listed below are
some of the most common reasons to conduct evaluations. These reasons cut across the three types
of evaluation just mentioned. The degree to which the perspectives of the most important potential
users are incorporated into an evaluation design will determine the usefulness of the effort.
Administrators are often most interested in keeping track of program activities and
documenting the nature and extent of service delivery. The type of information they seek to
collect might be called a "management information system" (MIS). An evaluation for project
management monitors the routines of program operations. It can provide program staff or
administrators with information on such items as participant characteristics, program
activities, allocation of staff resources, or program costs. Analyzing information of this type
(a kind of process evaluation) can help program staff to make short-term corrections
ensuring, for example, that planned program activities are conducted in a timely manner.
This analysis can also help staff to plan future program direction such as determining
resource needs for the coming school year.
Operations data are important for responding to information requests from constituents, such
as funding agencies, school boards, boards of directors, or community leaders. Also,
descriptive program data are one of the bases upon which assessments of program outcome are built; it does not make sense to conduct an outcome study if results cannot be connected
to specific program activities. An MIS also can keep track of students when the program ends
to make future follow-up possible.
Evaluation can help to ensure that project activities continue to reflect project plans and
goals. Data collection for project management may be similar to data collection for staying
on track, but more information might also be needed. An MIS could indicate how many
students participated in a prevention club meeting, but additional information would be
needed to reveal why participants attended, what occurred at the meeting, how useful
participants found the session, or what changes the club leader would recommend. This type
of evaluation can help to strengthen service delivery and to maintain the connection between
program goals, objectives, and services.
Evaluation can help to streamline service delivery or to enhance coordination among various
program components, lowering the cost of service. Increased efficiency can enable a program
to serve more people, offer more services, or target services to those whose needs are
greatest. Evaluation for program efficiency might focus on identifying the areas in which a
program is most successful in order to capitalize upon them. It might also identify
weaknesses or duplication in order to make improvements, eliminate some services, or refer
participants to services elsewhere. Evaluations of both program process and program
outcomes are used to determine efficiency.
When it comes to evaluation for accountability, the users of the evaluation results likely will
come from outside of program operations: parent groups, funding agencies, elected officials,
or other policymakers. Be it a process or an outcome evaluation, the methods used in
accountability evaluation must be scientifically defensible, and able to stand up to greater
scrutiny than methods used in evaluations that are intended primarily for "in-house" use. Yet
even sophisticated evaluations must present results in ways that are understandable to lay
audiences, because outside officials are not likely to be evaluation specialists.
Since there is no single "best" approach to evaluation which can be used in all situations, it
is important to decide the purpose of the evaluation, the questions you want to answer, and
which methods will give you usable information that you can trust. Even if you decide to hire
an external consultant to assist with the evaluation, you, your staff, and relevant stakeholders
should play an active role in addressing these questions. You know the project best, and
ultimately you know what you need. In addition, because you are one of the primary users of
evaluation information, and because the quality of your decisions depends on good
information, it is better to have "negative" information you can trust than "positive" information in which you have little faith. Again, the purpose of project-level evaluation is
not just to prove, but also to improve.
People who manage innovative projects have enough to do without trying to collect
information that cannot be used by someone with a stake in the project. By determining who
will use the information you collect, what information they are likely to want, and how they
are going to use it, you can decide what questions need to be answered through your
evaluation.
TYPES OF EVALUATION:
MacDonald (1976) has elucidated the complexity and variety of the contexts of evaluation by characterizing three styles of evaluation:
Bureaucratic: It is an unconditional service to those government agencies which have major control
over the allocation of educational resources. The evaluator accepts the values of those
who hold office, and offers information which will help them to accomplish their
policy objectives. He acts as a management consultant, and his criterion of success is
client satisfaction. His technique of study must be credible to the policy-makers and
not lay them open to public criticism. He has no independence, no control over the use
made of his information, and no court of appeal. The report is owned by the
bureaucracy and lodged in its files.
Autocratic: It is a conditional service to those government agencies which have major control over
the allocation of educational resources. It offers external validation of policy in
exchange for compliance with its recommendations. Its values are derived from the
evaluator's perception of the constitutional and moral obligation of the bureaucracy.
He focuses upon issues of educational merit and acts as expert adviser. His techniques
of study must yield scientific proofs, because his power base is the academic research
community. His contractual arrangements guarantee non-interference by the client, and
he retains ownership of the study. His report is lodged in the files of the bureaucracy,
but is also published in academic journals. If his recommendations are rejected, policy
is not validated. His court of appeal is the research community, and high levels in the
bureaucracy.
PURPOSES OF EVALUATION:
Program evaluations are typically conducted to accomplish one, two or all of the following purposes.
PRINCIPLES OF EVALUATION:
It is said that evaluation should always be regarded as a process guided by principles. The principles that govern the operation of the evaluation process are as follows:
Motivation
Accountability
Equipment
Placement
Diagnosis
Evaluation of learning
Prediction
Program Evaluation
STANDARDS OF MEASUREMENT AND EVALUATION — MEASUREMENT THEORY
Reliability
Validity
Usability
Objectivity
Sensitivity
DIFFERENCE BETWEEN MEASUREMENT AND EVALUATION
1. Measurement refers to the process by which the attributes or dimensions of some physical object are determined. Evaluation is perhaps the most complex and least understood of the two terms; inherent in the idea of evaluation is "value".
2. Measurement is the process of gathering data; it is the narrower term and involves comparative judgments. Evaluation is the process of making judgments about measured data; it is the technique of value judgment.
3. Without measurement there is no positive assurance that judgments are accurate, no proof that a real problem exists, and no assurance that training efforts have achieved their objectives.
4. Measurement is objective and remains constant whoever measures. Evaluation may be subjective and depends on the frame of mind of the doer.
5. For example, taking out 1 kg of rice is measurement, but determining whether it is of good quality is evaluation.
6. Measurement can be a valuable input into evaluation, but it should never be equated with it. Evaluation is composed of three component parts: "e", "value" and "action"; the central element of the concept is value, and the outcome of an evaluation is a judgment.
7. Measurement is associated with the organization; evaluation is associated with the organizational process.
8. Measurement provides high-quality information to assist in learning and improvement, rather than merely monitoring goal achievement.
9. Measurement is the most fundamental management system; it includes management, motivation, service and training.
10. Measurement directs behavior, increases the visibility of performance, increases alignment, improves decision making and problem solving, and gives early warning signals.
11. Measurement enables prediction and understanding.
1. Measurement provides data; evaluation interprets the data provided by measurement.
2. Measurement is only a part of the system of examination; evaluation is a comprehensive whole of the system of examination.
3. Measurement suffers from limitations and shortcomings; evaluation is an attempt to remove these limitations and shortcomings.
4. Measurement is restricted to quantitative description of pupil behavior; evaluation includes both quantitative and qualitative description, together with value judgments of that behavior.
5. Measurement tools may not be in a position to provide data on many educational factors; evaluation endeavors to cover all aspects of the process of education.
6. Measurement is like a product obtained after testing; evaluation is like a process developing out of the products of testing.
7. In measurement, equal-interval and ratio scales are used; in evaluation, nominal, ordinal and equal-interval scales are used.
8. The functions of measurement are prognosis, diagnosis and research; the functions of evaluation are selection, grading, guidance, prediction and diagnosis.
9. In measurement, a formal process is planned; in evaluation, both formal and informal processes are planned.
10. Measurement can be done at any time and place; evaluation is a continuous process.
11. Measurement is content-centered; evaluation is objective-centered.
12. Measurement is one-dimensional in relation to its environment; evaluation is multi-dimensional in relation to its environment.
NATURE, SCOPE, NEED, TYPES AND LIMITATIONS
OF EDUCATIONAL MEASUREMENT &
EVALUATION
NATURE OF EDUCATIONAL MEASUREMENT
Thorndike wrote that "the nature of educational measurement is the same as that of all scientific measurement." Educational measurement includes mental measurement along with physical measurement.
i) Measurement in education is quantitative in nature, expressing its results in quantitative terms.
ii) Measurement is expressed in constant units.
Q. What is the value of measuring accurately the results of teaching?
Answer: The answer may rest on common sense, logical reasons or experimental evidence. The logical reason for valuing accurate measurement by means of standardized tests is the generally accepted principle that accuracy is valuable in any field of human activity.
The nature of educational measurement is as follows:
a) It should be objective, reliable, and valid
b) It should be comprehensive and precise
c) It should be usable and practicable
- The usability implies the following features:
i) Ease in administering the tool
ii) Ease in scoring the answer scripts
iii) Ease in interpreting scores
iv) It should be economical in terms of time, energy and money.
Proficiency Measurement:
To assess the general ability of a person at a given time, there is a reasonable expectation of what abilities learners of a given status should possess. National-level selection across different states and university jurisdictions can be taken as a typical example.
Apart from these characteristics, measurement has its own limitations:
i) The most important limitation of measurement is that it is quite difficult to decide the nature of the object to be measured.
ii) Its scope is narrow and quite limited.
iii) Measurement fails to make a clear-cut distinction between two traits, such as character and personality, or achievement and aptitude, resulting in a low level of measurement.
iv) The process of measurement is often complex.
v) The traits measured are abstract rather than concrete. They have different meanings for different categories; as a result, such measurement is not as accurate as physical measurement.
vi) Measurement only provides information rather than any kind of decision.
vii) One of the most important limitations is that the characteristics measured are not physical and fixed; they are continually changing.
viii) In the absence of knowledge of the dimensions of educational characteristics, the measurement process lacks accuracy in comparison with physical measurement.
NATURE OF EDUCATIONAL EVALUATION
Evaluation includes all the means of collecting information about the students' learning.
The evaluator should make use of tests, observations, interviews, rating scales, checklists,
intuition, and value judgment to gather complete and reliable information about the students.
The following characteristics describe the nature of educational evaluation.
According to Arora and Vashist, the scope of modern educational evaluation is as follows:
NEED OF EDUCATIONAL EVALUATION
Educational evaluation has come a long way since its initiation by Ralph Tyler
more than half a century ago. A thorough educational evaluation or psycho-educational
evaluation will provide you with your child's educational strengths, weaknesses, and
recommendations for educational interventions. Remember that a student's inability to stay
on task, hyperactivity, distractibility, and/or impulsivity will affect her performance on the
educational evaluation. There is usually no certification or license for an educational
evaluation. With this in mind, it is a priority to get information from the child's teachers:
classroom teacher, special subject teachers (music, art, physical education, and computer),
lunchroom monitor, playground monitors, and others who come into contact with the child
in the school setting.
According to Dr. B.S. Bloom (1971), the needs for educational evaluation are as follows:
a) To discover the extent of competence which the student has developed in initiating,
organizing, and improving his day-to-day work.
b) To diagnose the strengths and weaknesses of the learner with a view to guiding him
in the future.
c) To predict the educational practices which a student-teacher can best make use of.
d) At the end of a course, to certify the student's degree of competency in a particular
field.
e) To provide information to enable each pupil to develop his potentialities within the
framework of educational programme.
Purpose/function of educational evaluation
Truth destroys error, and Love destroys hate.
According to Neil J. Salkind (2009), Observed Score = True Score + Error Score: the less
error, the more reliable. It is that simple; in other words, reduce the error and increase
the reliability.
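Salkind's observed-score model can be illustrated with a small simulation. This is a minimal sketch, not part of the original text; the true score, the two error spreads, and the sample size are all assumptions chosen only for illustration:

```python
import random

random.seed(1)  # fixed seed so the illustration is repeatable

TRUE_SCORE = 80.0  # hypothetical true score, chosen only for illustration

def observed_score(error_sd):
    """Observed Score = True Score + Error Score (Salkind, 2009)."""
    return TRUE_SCORE + random.gauss(0, error_sd)

def spread(scores):
    """Range of the observed scores: a rough index of (un)reliability."""
    return max(scores) - min(scores)

# Less error -> observed scores cluster more tightly around the true score.
noisy  = [observed_score(10.0) for _ in range(500)]
steady = [observed_score(1.0) for _ in range(500)]

print(spread(noisy), spread(steady))
```

Reducing the error term is the only thing that changes between the two sets of scores, yet the low-error scores vary far less around the true score, which is exactly the sense in which less error means more reliability.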
Error is the amount of deviation in a physical quantity that arises as a result of the
process of measurement or approximation. Another term for error is uncertainty. Physical
quantities such as weight, volume, temperature, speed, or time must all be measured by an
instrument of one sort or another. No matter how accurate the measuring tool, be it an
atomic clock that determines time based on atomic oscillation or a laser interferometer that
measures distance to a fraction of a wavelength of light, some finite amount of uncertainty is
involved in the measurement. Thus, a measured quantity is only as accurate as the error
involved in the measuring process. In other words, the error, or uncertainty, of a
measurement is as important as the measurement itself.
As the resolution of the measurement increases, the accuracy increases and the error
decreases. For example, if the measurement were performed again using a cup as the unit of
measure, the resultant volume would be more accurate because the fractional unit of water
remaining (less than a cup) would be a smaller volume than the fractional gallon. If a
teaspoon were used as a measuring unit, the volume measurement would be even more
accurate, and so on.
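The effect of unit size on measurement error can be sketched numerically. In this illustrative snippet, volumes are expressed in US customary teaspoons (768 per gallon, 48 per cup), and the "true" volume is an arbitrary assumption; rounding to the nearest whole unit bounds the error by half a unit, so finer units give smaller errors:

```python
# US customary units, expressed in teaspoons (an assumption of this sketch)
GALLON_TSP = 768   # teaspoons per US gallon
CUP_TSP = 48       # teaspoons per US cup

def measure(true_tsp, unit_tsp):
    """Round the true volume to the nearest whole unit; return the
    observed volume and the absolute error, both in teaspoons."""
    observed = round(true_tsp / unit_tsp) * unit_tsp
    return observed, abs(observed - true_tsp)

true_volume = 4485  # an arbitrary "true" volume in teaspoons
for name, unit in [("gallon", GALLON_TSP), ("cup", CUP_TSP), ("teaspoon", 1)]:
    observed, error = measure(true_volume, unit)
    print(f"{name:8s} unit: observed = {observed} tsp, error = {error} tsp")
```

Running the loop shows the error shrinking as the unit shrinks, mirroring the gallon, cup, and teaspoon example in the text.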
As the example above shows, error is expressed in terms of the difference between
the true value of a quantity and its approximation. A positive error is one in which the
observed value is larger than the true value; in a negative error, the observed value is
smaller. Error is most often given in terms of positive and negative error. For example, the
volume of water in the bathtub could be given as 6 gallons +/- 0.5 gallon, or 96 cups +/- 0.5
cup, or 4608 teaspoons +/- 0.5 teaspoon. Again, as the uncertainty of the measurement
decreases, the value becomes more accurate.
An error can also be expressed as a ratio of the error of the measurement and the true
value of the measurement. If the approximation were 25 and the true value were 20, the
relative error would be 5/20. The relative error can also be expressed as a percent; in this
case, the percent error is 25%.
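The worked figures above can be checked in a few lines; the function name here is illustrative, not standard:

```python
def relative_error(approximation, true_value):
    """Relative error: |approximation - true_value| / |true_value|."""
    return abs(approximation - true_value) / abs(true_value)

rel = relative_error(25, 20)            # 5/20 = 0.25
print(f"relative error = {rel}, percent error = {rel:.0%}")  # 25%
```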
Measurement error can be generated by many sources. In the bathtub example, error
could be introduced by poor procedure such as not completely filling the bucket or
measuring it on a tilted surface. Error could also be introduced by environmental factors
such as evaporation of the water during the measurement process. The most common and
most critical source of error lies within the measurement tool itself, however. Errors would
be introduced if the bucket were not manufactured to hold a full gallon, if the lines
indicating quarter gallons were incorrectly scribed, or if the bucket incurred a dent that
decreased the amount of water it could hold to less than a gallon.
Measurement Error
Knowledge gained from the study of measurement science will make clinicians more, or
less, certain of their interpretations of the research summarized above and of the
confidence they place in the values reported as the true amount of axial rotation permitted
by the orthoses. For example, measurement theory shows one can never absolutely measure the
true quantity of a concept. Every measure taken by clinicians or scientists has a shadow
component, termed error.
The error associated with a measurement is defined as the difference between the
unknowable true score and the observed score recorded while taking measurements. Since
in theory the true score always remains unknown, it is crucial to estimate the errors
associated with observed scores, or measures, as a means of establishing confidence in the
measuring devices and procedures. This can be done by taking repeated measures of the
same phenomenon and then describing the various observed scores. If repeated observed
scores are consistent, it is assumed that the measurement error is small and that the observed
scores closely approximate the true score (9). The measurement device and procedures are
declared reliable, and one of the major pitfalls of clinical research, measurement error, has
been overcome.
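The repeated-measures logic described above can be sketched as follows; the two sets of readings are invented purely for illustration:

```python
import statistics

def summarize(trials):
    """Mean of repeated trials approximates the true score; the sample
    standard deviation estimates the size of the measurement error."""
    return statistics.mean(trials), statistics.stdev(trials)

# Hypothetical repeated readings (in degrees) of the same phenomenon
consistent_trials   = [42.1, 41.9, 42.0, 42.2, 41.8]
inconsistent_trials = [38.0, 46.5, 40.2, 45.1, 39.9]

for trials in (consistent_trials, inconsistent_trials):
    mean, sd = summarize(trials)
    print(f"mean = {mean:.2f}, estimated error (SD) = {sd:.2f}")
```

The small standard deviation of the consistent trials is what licenses the conclusion that the device and procedure are reliable; the mean of the inconsistent trials is a much weaker stand-in for the true score.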
Measurement error is often categorized as occurring either randomly or
systematically in an experiment. Random errors are inconsistent discrepancies that occur by
chance in a study. They are not found to follow any pattern that could introduce bias into
the results; they are simply naturally occurring events that detract from the precision of
clinical measures (10). If the researcher is inexperienced with the measurements to be taken
and is uncertain about his/her judgments, the possibility of random error is introduced.
Research procedures often include repeated trials for measurements to decrease these types
of random errors so an average of several trials may be entered for the subject's score. This
method will provide a score that will more closely approximate the subject's true score than
does any one trial score (10). Consistent errors that persist from one subject to another are
considered systematic errors.
Both random error and systematic error will undermine the validity of the clinical
measure (6). Systematic error is of particular concern since its effect on reliability can go
undetected; thus, a clinician may assume the clinical measure is reliable and proceed with
its use. In the study example, range-of-motion measurements were taken by placing a
precision protractor on a video monitor's screen and measuring both beginning and ending
angular measurements. If the numbers marked on the protractor were in error by three
degrees, then all measurements made with that protractor would be off by three degrees in
the same direction. One can see that this systematic error would not affect the reliability of
the measurements taken by the investigators, but statements about the average amount of
axial rotation permitted by an orthosis would carry with them the error of three degrees. This
illustrates the intimate relationship between reliability and validity and the influence of
measurement error on both of these important characteristics of measures.
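The protractor scenario can be sketched as a small simulation. The three-degree offset matches the example in the text, but the true rotation, noise level, and number of trials are assumptions for illustration only:

```python
import random
import statistics

random.seed(7)  # fixed seed so the illustration is repeatable

TRUE_ROTATION = 30.0    # hypothetical true axial rotation, in degrees
SYSTEMATIC_BIAS = 3.0   # a mis-marked protractor shifts every reading

def reading(bias):
    """One measurement: true value + small random error + systematic bias."""
    return TRUE_ROTATION + random.gauss(0, 0.5) + bias

unbiased = [reading(0.0) for _ in range(200)]
biased   = [reading(SYSTEMATIC_BIAS) for _ in range(200)]

# Reliability (the spread of repeated readings) is essentially unchanged...
print(f"SD unbiased = {statistics.stdev(unbiased):.2f}, "
      f"SD biased = {statistics.stdev(biased):.2f}")
# ...but validity suffers: every biased reading is shifted the same way.
print(f"mean shift = {statistics.mean(biased) - statistics.mean(unbiased):.2f} deg")
```

The two standard deviations come out nearly identical, so a reliability check alone would never reveal the problem, while the mean of the biased readings is off by roughly three degrees, which is precisely how systematic error undermines validity without touching reliability.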
As quickly becomes obvious, not all errors can be completely eliminated from
clinical investigations. Attempts to control sources of errors in measurement and procedures
of research are not unlike the tension between internal and external experimental validity
discussed by Lunsford (11). The application of too great an effort to control error may
enhance reliability but somewhat decrease validity. Reasonable efforts to assess
measurement properties should be expected of those conducting clinical investigations.