
Experimental Uncertainties

1 Identifying sources of uncertainty

The results of measurements can be influenced in many ways. Uncertainties can arise from
instrumental limitations, such as the resolution of the scale or the accuracy of its calibration,
external influences such as temperature fluctuations, variations in the system itself, and disturbance of the system due to the measuring process, for example putting a cold thermometer
in a hot fluid. Below are some of the basic categories. As a general rule, you always need to
think at least about resolution (how accurate your scale is) and repeatability (by how much
the result changes if you repeat the measurement).
1. Resolution: There will always be an uncertainty due to the limited information you can
extract from a measuring instrument. For an analog instrument, the uncertainty due to
resolution is usually half the smallest division on the scale, as you are usually capable
of rounding the measurement to the nearest graduation. For a digital instrument, the
uncertainty is best taken as ±1 in the last digit, because you don't know whether the
instrument rounded or truncated the later digits.
2. Repeatability: If you repeat an experiment, you will most likely get a different value,
even if you think you have repeated it in exactly the same way (unless the experiment
is very simple or your precision is not that great). Quite often these variations can be
called noise: this is where the reading (for example, on a voltmeter, or on an oscilloscope)
changes quickly and randomly when nothing is changing in the experimental set-up and
conditions. In some cases the signal-to-noise ratio is so large that you are unaware of the
noise, but it will be present in most of the experiments you do.
To estimate the size of the uncertainty due to noise, it is best to take several measurements and average them. For a small number of measurements, a simple estimate of the
associated uncertainty is half the range of measurements. For example, if you measure
voltages of 1.25 V, 1.35 V, 1.30 V and 1.33 V, for which the average is 1.31 V with a
range of 0.1 V, a reasonable estimate of the uncertainty is 0.05 V. Another method
would be to use an uncertainty that covers the spread of measurements from the mean:
in this case, 0.06 V.
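For the voltage example above, the two quick estimates can be sketched as follows (the readings are the ones from the text):

```python
# Quick noise-uncertainty estimates for a small set of repeated readings.
readings = [1.25, 1.35, 1.30, 1.33]  # voltages in V

mean = sum(readings) / len(readings)               # 1.3075 V, i.e. about 1.31 V
half_range = (max(readings) - min(readings)) / 2   # half the range: 0.05 V
max_spread = max(abs(r - mean) for r in readings)  # covers the spread: about 0.06 V

print(f"mean = {mean:.2f} V, half-range = {half_range:.2f} V, spread = {max_spread:.2f} V")
```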
If you take more than about 10 measurements, you can use statistical methods to estimate the uncertainty from the range of measured values. For a set of n measurements
x_1, x_2, \ldots, x_n, first calculate the mean \bar{x} and the standard deviation of the measurements.
The standard deviation is calculated as

\sigma = \sqrt{ \frac{ \sum_{i=1}^{n} (x_i - \bar{x})^2 }{ n - 1 } }
It is usually easiest to use a computer or calculator with a built-in function such as


Excel's STDEV. Make sure you use the formula for the sample standard deviation, not
the population standard deviation, since you are sampling from a population of possible
measurements: you have not made every repeat measurement possible. (The formula
for the population standard deviation has n instead of n - 1 in the denominator.) The
standard deviation indicates how big the scatter of the measurements is around the mean.
For uncorrelated measurements, the standard error in the mean is obtained from the
experimental standard deviation by

\sigma(\bar{x}) = \frac{\sigma}{\sqrt{n}}    (1)

This is a measure of how well the mean has been determined by your repeat measurements
and is the uncertainty you would quote on your experimental result. Note the dependence
on \sqrt{n}: the more measurements you take, the smaller the uncertainty.
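Both calculations can be sketched as below (the repeat measurements here are made up for illustration):

```python
import math

def sample_std(xs):
    """Sample standard deviation (n - 1 in the denominator, like Excel's STDEV)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

def standard_error(xs):
    """Standard error in the mean for uncorrelated measurements: sigma / sqrt(n)."""
    return sample_std(xs) / math.sqrt(len(xs))

# hypothetical set of 10 repeat measurements
data = [9.8, 10.1, 9.9, 10.3, 10.0, 9.7, 10.2, 9.9, 10.1, 10.0]
print(sample_std(data), standard_error(data))
```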
Assuming we are dealing with a Gaussian or normal distribution of measured values, the
true (asymptotic as n \to \infty) value has a 68% chance of falling within \sigma(\bar{x}) of the mean
of your measurements. Uncertainties in science are usually quoted at a 68% level.
We also use the standard error in the mean to examine outliers or anomalous readings:
we expect 95% of our data to fall within 2\sigma(\bar{x}) of the mean or expected value, so if a
measurement falls more than several \sigma(\bar{x}) away from the mean, it is often worth re-measuring
or considering whether it follows the pattern we attributed to the data.
3. Calibration: No instrument is perfect; it always has to be checked or calibrated against
another instrument. (That instrument in turn also must be checked, leading in a chain
all the way back to the international reference standards.) Though this information
is not usually available in undergraduate labs, you will sometimes be able to calibrate
your equipment yourself. For example, in the photo-electric effect experiment, you can
calibrate the wavelengths being passed through the monochromator using the standard
filters provided, and you can calibrate zero on the picoammeter using the zero check and
zero adjustment. Similarly, you can try to think of ways of ensuring the thermometers used
to measure the solar house temperatures are properly calibrated (e.g. by measuring the
temperature of ice). Assuming a device behaves linearly, a two point calibration is often
sufficient to check that it is calibrated properly.
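A two-point calibration of the kind described above can be sketched as follows; the reference points (an ice bath at 0 °C and boiling water at 100 °C) and the device readings are made up for illustration:

```python
# Two-point calibration sketch: assuming the device responds linearly,
# readings at two known reference points fix a gain and an offset.
def two_point_calibration(reading_lo, true_lo, reading_hi, true_hi):
    gain = (true_hi - true_lo) / (reading_hi - reading_lo)
    offset = true_lo - gain * reading_lo
    return lambda reading: gain * reading + offset

# e.g. a thermometer that reads 1.2 degC in an ice bath and 98.4 degC in boiling water
correct = two_point_calibration(1.2, 0.0, 98.4, 100.0)
print(correct(50.0))  # corrected mid-scale reading
```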
4. Disturbances due to the measuring apparatus: The act of measuring can influence the
quantity under measurement. Like the external influences, if you can correct for this, do
so, otherwise try to quantify the uncertainty this is introducing into your measurement. A
simple example is a temperature measurement: when you lower a cold thermometer into a
hot liquid, the immediate temperature reading will be lower than the initial temperature
of the liquid. Waiting for the system to reach thermal equilibrium is one way of minimizing
the disturbance in this case.
5. External influences: The environment in which the experiment is conducted (temperature, pressure, humidity, vibrations, background illumination) will affect the
measurements obtained. The first step in accounting for such effects is to estimate by how
much a change in the external parameter affects your measurement, either by theoretical
considerations or experimental tests. For example, if you are measuring electromagnetic
spectra, you can test whether the background illumination in the room has a serious
impact on your data by taking a measurement with the spectrograph probe covered. In
some cases you may be able to quantify these effects, but often you can't and so you will
just need to make a sensible decision about whether such factors are negligible, or what
order of magnitude they might contribute to your uncertainties.

2 Propagation of uncertainties

We denote the uncertainty in a quantity x_1 by \Delta x_1. So for a length (20.5 \pm 0.5) cm, we
have x_1 = 20.5 cm and \Delta x_1 = 0.5 cm. We can also write uncertainties in the following form:
x_1 = 20.5(5) cm, where the number in parentheses indicates the uncertainty in the last digit; such
notation is common in physics research.
If your desired answer is a scalar multiple of the value you measure (for example, you
measure a photon frequency f and want to relate it to the photon's energy through E = hf,
where h is Planck's constant), then the uncertainty on the final result is simply the same scalar
multiplied by the uncertainty on your measurement: that is, \Delta E = h \Delta f.
If you identify more than one source of uncertainty, you may have to combine them. Say
we measure quantities x_1, x_2, x_3, \ldots which have associated uncertainties \Delta x_1, \Delta x_2, \Delta x_3, \ldots We
will assume that the measurements are uncorrelated, so that the outcome of a measurement of
one quantity has no effect on the measurement of the others. (If they are correlated, things get
a little more complicated.)
We then combine the results of our measurements to produce a final answer z = f (x1 , x2 , x3 ).
The general rule for combining uncertainties is:
(\Delta z)^2 = \left( \Delta x_1 \frac{\partial z}{\partial x_1} \right)^2 + \left( \Delta x_2 \frac{\partial z}{\partial x_2} \right)^2 + \left( \Delta x_3 \frac{\partial z}{\partial x_3} \right)^2 + \ldots = \sum_{i=1}^{n} \left( \Delta x_i \frac{\partial z}{\partial x_i} \right)^2    (2)
However, you should always try to make your life as easy as possible: before you start
combining uncertainties, look at the relative contributions each uncertainty is going to make
to the uncertainty on your final result. If you are adding two (or more) numbers together,
compare the absolute sizes of the uncertainties: if one is a lot larger than the other, it is often
sufficient to just use that. If the numbers are combined in any other way, compare the fractional
uncertainties: if one quantity has a fractional uncertainty of 1/10 or 10%, while the other has
a fractional uncertainty of 1/100 or 1%, your final answer will have a 10% uncertainty. In such
cases, where one source of uncertainty dominates the others, you do not need to do a complex
error propagation.
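Equation 2 can also be evaluated numerically, which is useful when z = f(x_1, x_2, \ldots) is awkward to differentiate by hand. A sketch, in which the example function, values and finite-difference step are arbitrary choices:

```python
import math

def propagate(f, values, uncerts, h=1e-6):
    """General uncertainty propagation (equation 2) via numerical partial derivatives."""
    total = 0.0
    for i, (v, dv) in enumerate(zip(values, uncerts)):
        shifted = list(values)
        shifted[i] = v + h
        dzdx = (f(*shifted) - f(*values)) / h   # forward-difference partial derivative
        total += (dv * dzdx) ** 2
    return math.sqrt(total)

# z = x1 * x2 with a 10% and a 1% fractional uncertainty: the 10% one dominates
dz = propagate(lambda x1, x2: x1 * x2, [2.0, 5.0], [0.2, 0.05])
print(dz / 10.0)  # fractional uncertainty on z, close to 10%
```

Note how the result confirms the rule of thumb above: the 1% contribution is negligible next to the 10% one.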

2.1 Common cases

Make sure you can see how all of the following examples come from equation 2.
Addition or subtraction: if z = x_1 + x_2 or z = x_1 - x_2,

\Delta z = \sqrt{ (\Delta x_1)^2 + (\Delta x_2)^2 }

Multiplication or division: if z = x_1 x_2 or z = x_1 / x_2,

\left( \frac{\Delta z}{z} \right)^2 = \left( \frac{\Delta x_1}{x_1} \right)^2 + \left( \frac{\Delta x_2}{x_2} \right)^2

Powers: if z = x_1^n,

\frac{\Delta z}{z} = n \frac{\Delta x_1}{x_1}
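The three common cases can be sketched as small helper functions (the numerical values in the examples are arbitrary):

```python
import math

def add_sub_uncert(dx1, dx2):
    """z = x1 + x2 or z = x1 - x2: add absolute uncertainties in quadrature."""
    return math.sqrt(dx1**2 + dx2**2)

def mul_div_uncert(z, x1, dx1, x2, dx2):
    """z = x1*x2 or z = x1/x2: add fractional uncertainties in quadrature."""
    return abs(z) * math.sqrt((dx1 / x1)**2 + (dx2 / x2)**2)

def power_uncert(z, x1, dx1, n):
    """z = x1**n: the fractional uncertainty is scaled by |n|."""
    return abs(z) * abs(n) * dx1 / abs(x1)

print(add_sub_uncert(0.3, 0.4))                 # 0.5
print(mul_div_uncert(6.0, 2.0, 0.1, 3.0, 0.3))  # about 0.67
print(power_uncert(8.0, 2.0, 0.1, 3))           # 1.2
```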

3 The least-squares model and least-squares fitting

The least-squares model assumes that the data you obtain is a function of one or more variables
(which you need to estimate) plus scatter; that is, data is signal plus noise.
For example, if you make a series of measurements of some quantity m, we assume that the
result of your ith measurement, d_i, is equal to m + \epsilon_i, where \epsilon_i is a measure of the noise. The \epsilon_i
are often referred to as residuals. For example, in one experiment you may measure the mass
of 6 nominally identical 50 g masses, which you find to be 49.9, 50.0, 50.1, 50.5, 49.3 and 49.7
grams. Each of these provides an equation:
d_1 = 49.9 = m + \epsilon_1
d_2 = 50.0 = m + \epsilon_2
d_3 = 50.1 = m + \epsilon_3
d_4 = 50.5 = m + \epsilon_4
d_5 = 49.3 = m + \epsilon_5
d_6 = 49.7 = m + \epsilon_6
So we have 6 equations and 7 unknowns: m (the average mass of the whole population of
masses) and the 6 residuals.
The least-squares method takes the sum of the squares of the residuals,

Q = \sum_i (d_i - m)^2

and minimizes it with respect to m, i.e. it calculates dQ/dm and finds the minimum,

\frac{dQ}{dm} = \sum_i -2 (d_i - m) = 0
This gives an estimate of the mean mass of all of the masses, not just the ones we measured.
To find the variance and hence the uncertainty we take the sum of the squares of the residuals
and divide by the number of degrees of freedom; in this case, there are 6 residuals, but they
are constrained by the fact that they have to sum to zero, so there are 5 degrees of freedom:

\sigma^2 = \frac{ \sum_i (d_i - m)^2 }{ 5 }

You'd quote your result as m \pm \sigma.
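For the six mass measurements above, the least-squares estimate and its scatter can be computed as:

```python
import math

# The six (nominally 50 g) mass measurements from the text, in grams
d = [49.9, 50.0, 50.1, 50.5, 49.3, 49.7]
n = len(d)

m = sum(d) / n                       # the least-squares estimate is just the mean
residuals = [di - m for di in d]     # these are constrained to sum to zero
sigma = math.sqrt(sum(r**2 for r in residuals) / (n - 1))  # 5 degrees of freedom

print(f"m = {m:.2f} g, sigma = {sigma:.2f} g")
```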


The least-squares approach can be extended to deal with situations where more than one
variable needs to be determined. For data showing a linear trend that can be described in a
form

y = mx + c

we define a quantity

D = n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2

which should look vaguely familiar ...


Then

c = \frac{ \sum_{i=1}^{n} y_i \sum_{i=1}^{n} x_i^2 - \sum_{i=1}^{n} x_i y_i \sum_{i=1}^{n} x_i }{ D }

and

m = \frac{ n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i }{ D }

The residuals are calculated as

\epsilon_i = y_i - c - m x_i

and the root-mean-square error as

\sigma = \sqrt{ \frac{ \sum_{i=1}^{n} \epsilon_i^2 }{ n - 2 } }
The number of degrees of freedom is now \nu = n - 2 because there are two constraints on the
data, the two parameters m and c. Note that \sigma is NOT an uncertainty in the slope m! To find
the uncertainty on the slope and the intercept we have to go a step further,
\Delta c = \sigma \sqrt{ \frac{ \sum_{i=1}^{n} x_i^2 }{ D } }

\Delta m = \sigma \sqrt{ \frac{n}{D} }

3.1 Weighted least-squares fitting

The previous section assumed that the uncertainties on each individual measurement were the
same. However, the least-squares approach can be further extended to take into account uncertainties on individual measurements. Essentially this is done by assigning a weight to each
measurement such that measurements with large uncertainties contribute less to the minimization process than measurements with small uncertainties:
y_i \to W_i y_i, \qquad W_i = \frac{1}{\sigma_{y,i}^2}

x_i \to W_i x_i, \qquad W_i = \frac{1}{\sigma_{x,i}^2}


This is what the Mathematica template you were given in the first year lab does for you!
Some of you will also have calculators that can do this, and Excel has a built-in function, but
be careful to check whether you are using an unweighted least-squares fit or a weighted one,
since simply plotting the error bars on the data is not enough to make Excel pay attention to
them.
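If you want to see what such a template or spreadsheet is doing internally, the unweighted fit formulas of the previous section can be sketched as below (the data points here are made up for illustration):

```python
import math

def linear_fit(xs, ys):
    """Unweighted least-squares fit of y = m*x + c, with uncertainties on m and c."""
    n = len(xs)
    Sx, Sy = sum(xs), sum(ys)
    Sxx = sum(x * x for x in xs)
    Sxy = sum(x * y for x, y in zip(xs, ys))
    D = n * Sxx - Sx**2
    m = (n * Sxy - Sx * Sy) / D
    c = (Sy * Sxx - Sx * Sxy) / D
    resid = [y - c - m * x for x, y in zip(xs, ys)]
    sigma = math.sqrt(sum(r * r for r in resid) / (n - 2))  # rms error, nu = n - 2
    dm = sigma * math.sqrt(n / D)
    dc = sigma * math.sqrt(Sxx / D)
    return m, c, dm, dc

# made-up data with a roughly linear trend plus a little scatter
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(linear_fit(xs, ys))
```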

4 Testing a hypothesis

In many of the 2nd year lab experiments, you will have a hypothesis concerning how you expect
the data to behave: for example, that the stopping voltages and frequencies in the photoelectric
effect experiment should obey the relationship

eV = hf - \phi    (3)

That is, you believe that the stopping voltage is linearly related to the frequency of the incident
light through the above equation, with accepted values of e, h and \phi. You can use a least-squares
fit to obtain values of h and \phi from your data, but it is also often worth evaluating how
likely it is that your hypothesis is correct, i.e. that the data really are described by the theory
you think they're described by.
The first step you should take in testing a hypothesis is to make sure you know what your
hypothesis is and what your associated assumptions are. In this example, our assumptions
might be
1. that the energy of a photon depends in some way on its frequency;
2. that the energy of a photon is transferred to an electron in the metal of the photocathode;
3. that some of the energy (\phi) is used up in the process of the electron escaping from the
metal, and that this amount is a constant characteristic of the material;
4. that the charge on the electron is what we expect (i.e. a constant and equal to 1.6 \times 10^{-19}
C);
5. that our equipment allows us to isolate particular frequencies of light; and
6. that we know what those frequencies are!
There may be other assumptions underlying the way we analyse our data, but you get the
idea. If we take all these assumptions as reliable, our experiment provides a means of testing
whether the energy depends linearly on the frequency and of determining a value for Planck's
constant which can be compared to the accepted value.
To test the hypothesis that equation 3 correctly describes the situation, we compare the
result of our weighted least-squares fit to a standard probability distribution. One of the most
common ways of doing this is by using the \chi^2 statistic.

4.1 \chi^2 goodness-of-fit test

If the observations we make have no associated uncertainty (as may be the case in some statistical experiments), we calculate the test statistic (the \chi^2) as:

\chi^2 = \sum_{i=1}^{n} \frac{ (O_i - E_i)^2 }{ E_i }

where Oi is the observed result and Ei is the expected or theoretical result. You may be
worried by the idea of observations with no uncertainty, but think about the following (classic)
situation. The number of deaths per year by horse- or mule-kick were recorded in 10 Prussian
army corps over 20 years. These numbers were exact: if no such deaths occurred, the number
was zero. If one such death occurred, the number was exactly one. No uncertainties are attached
to such records.

On the other hand, if the observations we make do have associated uncertainties (as is more
often the case in the lab), we calculate the test statistic (the \chi^2) as:

\chi^2 = \sum_{i=1}^{n} \frac{ (O_i - E_i)^2 }{ \sigma_i^2 }

where \sigma_i is the uncertainty on the ith observation.


This statistic can then be compared to the \chi^2 distribution if the number of degrees of freedom
is known. The number of degrees of freedom, \nu, is the number of values in the calculation of
the statistic that are free to vary. This is equal to the number of data points entering into
the calculation (n) minus the number of parameters in the fit (p). In the photo-electric effect
example, this is equal to n - 2, since the straight-line fit includes a calculation of both the
intercept and the slope.
\chi^2 distributions are tabulated and available online for a variety of different degrees of freedom. However, a common approach in physics and many other fields is to calculate the \chi^2
statistic per degree of freedom,
\chi^2 / \nu = \frac{1}{n - p} \chi^2

4.1.1 Example

In 1898, von Bortkiewicz published a now-famous study on the occurrence of deaths by horse-
or mule-kick in 10 Prussian army corps over a period of 20 years. The data can be summarised
as follows:

Number of deaths    0    1    2    3    4
Frequency          109   65   22    3    1
We believe that such events should be governed by Poisson statistics, so we may make a
test to determine whether they follow a Poisson distribution,

P(x) = \frac{ \mu^x e^{-\mu} }{ x! }

where \mu is the mean of the distribution and P(x) is the probability of observing x events.
We can use a 2 goodness-of-fit test to see whether the horse- and mule-kick deaths are
consistent with a Poisson distribution. First, we calculate the mean number of deaths per year
and find it is 0.61.
Then we calculate the expected frequency with which each number of deaths is observed
assuming a Poisson distribution,

N(x) = 200 P(x) = 200 \frac{ (0.61)^x e^{-0.61} }{ x! }

Then we find the difference between the observed and expected values.

Number of deaths    0    1    2    3    4
E_i               109   66   20    4    1
O_i               109   65   22    3    1
O_i - E_i           0   -1    2   -1    0

Now we calculate the \chi^2 statistic, obtaining \chi^2 = 0.448.


In this case we have five data points and one calculated parameter (the mean of the distribution), giving four degrees of freedom. This gives us a \chi^2 per degree of freedom of 0.112. We then
look up or calculate a P-value, that is, the probability that we would have obtained a \chi^2 higher
than this if our hypothesis (that the data are described by a Poisson distribution) is correct.
There are online calculators such as the one at http://stattrek.com/Tables/ChiSquare.aspx
which you can use (just try googling chi-square distribution calculator if that link is defunct).
A table of P-values for a \chi^2 distribution with one degree of freedom is included at the end of
this document.
We find the probability of getting a larger \chi^2 statistic is 98%! This is equivalent to saying
that a much larger deviation (and hence larger \chi^2 statistic) would not have been unlikely. In
fact, the data are spookily close to a perfect Poisson distribution, with no noise or randomness
... but also (apparently) genuine.
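The whole test can be sketched as below. Note that this sketch uses the unrounded expected counts, so its \chi^2 comes out a little above the 0.448 obtained from the rounded table, but the conclusion is unchanged:

```python
import math

# Observed counts of years with 0..4 deaths, across 200 corps-years
observed = [109, 65, 22, 3, 1]
total = sum(observed)  # 200
mu = sum(k * o for k, o in zip(range(5), observed)) / total  # mean deaths per year: 0.61

# Expected counts under a Poisson distribution with mean mu (not rounded)
expected = [total * mu**k * math.exp(-mu) / math.factorial(k) for k in range(5)]
chi2 = sum((o - e)**2 / e for o, e in zip(observed, expected))

# P(chi^2 larger than this) for nu = 4 degrees of freedom; for even nu the
# chi-squared survival function has the closed form exp(-x/2) * sum_{j<nu/2} (x/2)^j / j!
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"chi2 = {chi2:.3f}, P(larger chi2) = {p_value:.3f}")
```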

4.1.2 Example

In a particular lab, you measure the extension of a metal wire as a function of applied load and
obtain the following data:

Load (kg), \pm 0.001 kg       0.100  0.200  0.300  0.400  0.500
Extension (mm), \pm 0.2 mm     0.5    1.1    1.6    2.3    3.4

Your hypothesis is that the stress applied to the wire has not exceeded the proportional
limit, so that the data should obey Hooke's law and the strain is proportional to the stress as

\frac{F}{A} = Y \frac{\Delta l}{l_0}

\Delta l = \frac{l_0}{A Y} F = \alpha F

If we assume that the cross-sectional area A does not change as the wire stretches, we can
calculate a \chi^2 statistic to test the goodness of fit of our straight-line hypothesis.
First, we perform a linear regression to find the constant of proportionality \alpha. Although we
have two sets of uncertainties, the fractional uncertainty on the applied load is much smaller
than that on the measured extension, so we can neglect the uncertainty on the load. In fact,
since the uncertainties are the same on all data points, we can get away with ignoring them
altogether for the step of doing the linear regression.
A linear regression on \Delta l vs F gives a slope of \alpha = 0.0007143 m/N. But what is the \chi^2 value,
i.e., was the straight-line fit a sensible thing to do in the first place?
Calculate the \chi^2 statistic for this example for yourself (use a spreadsheet to make it easier).
You will find that the \chi^2 is significantly greater than 1.
How many degrees of freedom are there in this problem? Remember the linear regression
gives values for both the slope and the intercept; that is, two parameters come out of the fit
...
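The spreadsheet calculation can be sketched as below (assuming g = 9.8 m/s^2 to convert the load in kg to a force in N, consistent with the slope quoted above):

```python
g = 9.8  # m/s^2, assumed value for converting load (kg) to force (N)
loads = [0.100, 0.200, 0.300, 0.400, 0.500]         # kg
exts = [0.5e-3, 1.1e-3, 1.6e-3, 2.3e-3, 3.4e-3]     # m
sigma_l = 0.2e-3                                    # m, uncertainty on each extension

F = [g * load for load in loads]
n = len(F)
Sx, Sy = sum(F), sum(exts)
Sxx = sum(x * x for x in F)
Sxy = sum(x * y for x, y in zip(F, exts))
D = n * Sxx - Sx**2
slope = (n * Sxy - Sx * Sy) / D                     # this is the alpha of the text
intercept = (Sy * Sxx - Sx * Sxy) / D

# chi^2 of the straight-line fit against the measured extensions
chi2 = sum(((y - intercept - slope * x) / sigma_l)**2 for x, y in zip(F, exts))
print(f"alpha = {slope:.4e} m/N, chi2 = {chi2:.2f} for {n - 2} degrees of freedom")
```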
The \chi^2 per degree of freedom is still significantly greater than one. This tells us that
this is not a good fit, so some of our assumptions must be wrong. One possibility is that
our uncertainty estimates are too small, but if we want to mess with them, we must have
a good reason. Maybe we didn't take parallax into account, or maybe we're just extremely
incompetent. Unless you can actually justify your actions, never mess with your
uncertainty estimates just to make a fit seem better.

What else could be wrong? Well, perhaps the data don't actually follow a straight line at
all. We know that if you stretch a wire enough, it will exceed its elastic limit: at some applied
stress, planes of molecules in the metal start to slide past each other and the extension rapidly
increases. Perhaps we're seeing the onset of that with the last point. To test this hypothesis,
remove the point, redo the linear regression and goodness-of-fit test, and see what the value of
\chi^2 is now ...
